Abstract

We consider five different classes of multivariate statistical problems
identified by James (1964). Each of these problems is related to the eigenvalues
of $E^{-1}H$, where H and E are proportional to high-dimensional Wishart
matrices. Under the null hypothesis, both Wisharts are central with identity
covariance. Under the alternative, the non-centrality or the covariance
parameter of H has a single eigenvalue, a spike, that stands alone. When the
spike is larger than a case-specific phase transition threshold, one of the
eigenvalues of $E^{-1}H$ separates from the bulk. This makes the alternative
easily detectable, so that reasonable statistical tests have asymptotic power
one. In contrast, when the spike is sub-critical, that is, lies below the
threshold, none of the eigenvalues separates from the bulk, which makes the
testing problem more interesting from the statistical perspective. In such
cases, we show that the log likelihood ratio processes parameterized by the
value of the sub-critical spike converge to Gaussian processes with logarithmic
correlation. We use this result to derive the asymptotic power envelopes for
tests for the presence of a spike in the data representing each of the five
cases in James’ classification.


1 Introduction 

High-dimensional multivariate models and methods, such as regression, principal
components, and canonical correlation analysis, have become the subject of much
recent research. In contrast to the classical framework where the dimensionality
is fixed, the current focus is on situations where the dimensionality diverges
to infinity together with the sample size. In this context, spiked models that
deviate from a reference model along a small fixed number of unknown directions
have proven to be a fruitful research tool. A basic statistical question that
arises in the analysis of spiked models is how to test for the presence of
spikes in the data.

James (1964) arranges multivariate statistical problems in five different groups
with broadly similar features. His classification corresponds to the five types
of the hypergeometric functions ${}_pF_q$ that often occur in multivariate
distributions. In this paper, we describe spiked models that represent each of
James’ classes, and derive the asymptotic behavior of the corresponding
likelihood ratios, that is, the ratios of the joint densities of the relevant
data under the alternative hypothesis, which assumes the presence of the spikes,
to that under the null of no spikes. In each of the cases, the relevant data
consist of the maximal invariant statistic represented by eigenvalues of a large
random matrix. We consider the asymptotic regime where the dimensionality of the
data and the number of observations go to infinity proportionally.

We find that the measures corresponding to the joint distributions of the
eigenvalues under the alternative hypothesis and under the null are mutually
contiguous when the values of the spikes are below a phase transition threshold.
The value of the threshold depends on the problem’s type. Furthermore, we find
that the log likelihood ratio processes parametrized by the values of the spikes
are asymptotically Gaussian, with logarithmic mean and autocovariance functions.
These findings allow us to compute the asymptotic power envelopes for the tests
for the presence of spikes in five multivariate models representing each of
James’ classes.

Our analysis is based on the classical results that assume Gaussianity. All the
likelihood ratios that we study correspond to the joint densities of the
solutions to the basic equation of classical multivariate statistics,

$$\det(H - \lambda E) = 0, \qquad (1)$$

where H and E are proportional to Wishart matrices.
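As a small numerical illustration of equation (1), the sketch below computes its solutions as generalized eigenvalues. It assumes NumPy, a symmetric H, and a symmetric positive definite E; the function name `solve_basic_equation` and the simulated matrices are hypothetical and only serve the example.

```python
import numpy as np

def solve_basic_equation(H, E):
    """Solutions of det(H - lam * E) = 0, sorted in decreasing order.

    Assumes H is symmetric and E is symmetric positive definite.
    """
    L = np.linalg.cholesky(E)              # E = L L'
    A = np.linalg.solve(L, H)              # L^{-1} H
    M = np.linalg.solve(L, A.T).T          # L^{-1} H L^{-T}, same roots as E^{-1} H
    M = (M + M.T) / 2                      # remove rounding asymmetry
    return np.sort(np.linalg.eigvalsh(M))[::-1]

# Illustrative data: two independent central Wisharts with identity covariance.
rng = np.random.default_rng(0)
p, n1, n2 = 5, 40, 60
X1 = rng.standard_normal((n1, p))
X2 = rng.standard_normal((n2, p))
H = X1.T @ X1 / n1
E = X2.T @ X2 / n2
lam = solve_basic_equation(H, E)
```

Each returned value makes `H - lam * E` numerically singular, which is exactly the defining property in (1).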

The five different cases that we study are: 1) E is a known deterministic
matrix, and H is a central Wishart matrix with covariance equal to a low-rank
perturbation of E; 2) both E and H are central Wisharts with unknown covariance
matrices that differ by a matrix of low rank; 3) E is a known deterministic
matrix, and H is a non-central Wishart matrix with covariance equal to E and
with a low-rank non-centrality; 4) E is a central Wishart matrix, while H is a
non-central one with the same unknown covariance matrix and with a low-rank
non-centrality; 5) E is a central Wishart, while H is a non-central Wishart
conditionally on a random low-rank non-centrality parameter. These five cases
can be linked via sufficiency and invariance arguments to a principal components
problem, a signal detection problem, hypotheses testing in multivariate
regression with known and with unknown error covariance, and a canonical
correlation problem, respectively. We briefly discuss the links in the next
section of the paper.

The main steps of our asymptotic analysis are the same for all the five cases. 
The likelihood ratios have explicit forms that involve hypergeometric functions 
of two high-dimensional matrix arguments. However, the low-rank nature of the 
alternatives that we consider ensures that one of the arguments has low rank. For
tractability, we focus on the special case of rank-one alternatives. In such a case,
using the recent result of Dharmawansa and Johnstone (2014), we represent the 
hypergeometric function of two high-dimensional matrix arguments in the form of 
a contour integral that involves a scalar hypergeometric function of the same type. 
Then we deform the contour of integration so that the integral becomes amenable 
to Laplace approximation analysis (see Olver (1997), chapter 4). 

Using the Laplace approximation technique, we show that the log likelihood 
ratios are asymptotically equivalent to random quadratic functions of the spike 
parameters. The randomness in the quadratic function enters via a linear spectral 
statistic of a large random matrix of either sample covariance or F-ratio type. 
Using the CLTs for the linear spectral statistics, established by Bai and Silverstein
(2004) for the sample-covariance-type random matrices and by Zheng (2012) for the 
F-ratio-type random matrices, we derive the asymptotic Gaussianity and obtain 
the mean and the autocovariance functions of the log likelihood ratio processes. 

The derived asymptotics of the log likelihood processes show that the
corresponding statistical experiments do not converge to Gaussian shift
experiments. In other words, the experiments that consist of observing the
solutions to equation (1) parameterized by the values of the spikes under the
alternative hypothesis are not of the Locally Asymptotically Normal (LAN) type.
This implies that there are no ready-to-use optimality results associated with
LAN experiments that can be applied in our setting. However, at the fundamental
level, the derived asymptotics of the log likelihood ratio processes are all
that is needed for the asymptotic analysis of the risk of the corresponding
statistical decisions.

In this paper, we use the derived asymptotics together with the Neyman-Pearson
lemma and Le Cam’s third lemma (see van der Vaart (1998)), to find simple
analytic expressions for the asymptotic power envelopes for the statistical
tests of the null hypothesis of no spikes in the data. The form of the envelope
is different depending on whether both H and E in the corresponding equation (1)
are Wisharts or only H is Wishart whereas E is deterministic.

For most of the cases, as the value of the spike under the alternative increases,
the envelope, at first, rises very slowly. Then, as the spike approaches the phase
transition, the rise quickly accelerates and the envelope ‘hits’ unity at the threshold.
However, in cases of two Wisharts and when the dimensionality is not much smaller
than the degrees of freedom of E, the envelope rises much faster. In such cases, the
information in all the eigenvalues of $E^{-1}H$ might be useful for detecting population
spikes which lie far below the phase transition threshold.

An analysis of the type performed in this paper has been previously implemented
in the study of the principal components case by Onatski et al (2013). Our work
extends theirs to the remaining four cases in James’ classification of
multivariate statistical problems. One of the hardest challenges in such an
extension is the rigorous implementation of the Laplace approximation step. With
this goal in mind, we have developed asymptotic approximations to the
hypergeometric functions ${}_1F_1$ and ${}_2F_1$ which are uniform in certain
domains of the complex plane.

A trivial observation that the solutions to equation (1) can be interpreted as
the eigenvalues of the random matrix $E^{-1}H$ relates our work to the vast
literature on the spectrum of large random matrices. We refer the reader to Bai
and Silverstein (2006) for a recent book-long treatment of the subject. Three
extensively studied classical ensembles of random matrices are the Gaussian,
Laguerre and Jacobi ensembles (see Mehta (2004)). However, only the Laguerre and
Jacobi ensembles are relevant for the five scenarios for (1) that correspond to
James’ five-fold classification of multivariate statistical problems. This
prompts us to search for a “missing” class in James’ system that could be linked
to the Gaussian ensemble.

Such a class is easy to obtain by taking the limit of $\sqrt{n_1}(H - I_p)$ as
$n_1 \to \infty$, where $n_1$ and p are H’s degrees of freedom and
dimensionality, respectively. The corresponding statistical problem can be
called “symmetric matrix denoising”. Under the null hypothesis, the observations
are given by a $p \times p$ matrix $Z/\sqrt{p}$ with Z from the Gaussian
Orthogonal Ensemble. Under the alternative, the observations are given by
$Z/\sqrt{p} + \Phi$, where $\Phi$ is a deterministic symmetric matrix of low
rank. We call this situation “case zero”, and add it to James’ classification.
We derive the asymptotics of the corresponding log likelihood ratio and obtain
the related asymptotic power envelope.

Many existing results in the random matrix literature do not require that the
data are Gaussian. This suggests that some results about tests for the presence
of the spikes in the data may remain valid without Gaussianity. One may, for
example, consider H and E in (1) that, although they have the form of sample
covariance matrices, do not come from an underlying Gaussian distribution, and
study the properties of the corresponding tests. We leave this line of research
to the future.

Since the explicit form of the joint distribution of the solutions to (1) is only
known in the Gaussian case, it seems unlikely that one would be able to completely
summarize the asymptotic behavior of the corresponding non-Gaussian statistical
experiments. We hope that the results of this paper, which provide such a summary
under Gaussianity, can serve as a useful benchmark for future studies that
would relax our assumptions.

The rest of the paper is organized as follows. In the next section, we relate
the five different cases of equation (1) to the classical multivariate
statistical problems representing different cells of James’ (1964) five-fold
classification system. In Section 3, we obtain explicit expressions for the
likelihood ratios. Section 4 represents the likelihood ratios in the form of
contour integrals. Section 5 performs the Laplace approximation analysis.
Section 6 derives the asymptotic power envelopes. Section 7 concludes. Technical
proofs are given in the Appendix.

2 Links to classical statistical problems 

Case 1 corresponds to the problem of using $n_1$ i.i.d. $N_p(0, \Omega)$
(p-dimensional Gaussian) observations to test the null hypothesis that the
population covariance $\Omega$ equals a given matrix $\Sigma$. The alternative
of interest is

$$\Omega = \Sigma + \psi\theta\psi'$$

with unknown $\theta > 0$ and $\psi$, where $\psi$ is normalized so that
$\|\Sigma^{-1/2}\psi\| = 1$.

Without loss of generality, we may assume that $\Sigma = I_p$. Then under the
null, the data are isotropic noise, whereas under the alternative, the first
principal component explains a larger portion of the variation than the other
principal components. We therefore label Case 1 as the ‘principal components
analysis’ (PCA) case.

The null and the alternative hypotheses can be formulated in terms of the
spectral ‘spike’ parameter $\theta$ as

$$H_0: \theta_0 = 0 \quad\text{and}\quad H_1: \theta_0 = \theta > 0, \qquad (2)$$

where $\theta_0$ is the true value of the ‘spike’. This testing problem remains
invariant under the multiplication of the $p \times n_1$ data matrix from the
left and from the right by orthogonal matrices, and under the corresponding
transformation in the parameter space. A maximal invariant statistic consists of
the solutions $\lambda_1 > \dots > \lambda_p$ of equation (1) with H equal to the
sample covariance matrix and $E = \Sigma$. We restrict attention to the
invariant tests. Therefore, the relevant data are summarized by
$\lambda_1, \dots, \lambda_p$.

Case 2 is represented by the problem of testing the equality of covariance
matrices, $\Omega$ and $\Sigma$, corresponding to two independent p-dimensional
zero-mean Gaussian samples of sizes $n_1$ and $n_2$. Throughout the paper, we
shall assume that

$$p < \min\{n_1, n_2\}.$$

The assumption $p < n_2$ is made to ensure the almost sure invertibility of
matrix E in (1), whereas the assumption $p < n_1$ is made to reduce the number
of various situations which need to be considered. Such a reduction makes our
exposition more concise.

Returning to Case 2, the alternative hypothesis is the same as in the PCA
case. Similar invariance considerations lead to tests based on the eigenvalues of
the F-ratio of the sample covariance matrices. Matrix H from (1) equals the
sample covariance corresponding to the observations that might contain a ‘signal’
responsible for the covariance spike, whereas matrix E equals the other sample
covariance matrix. We label Case 2 as the ‘signal detection’ (SigD) case. In this
case, we find it more convenient to work with the p solutions to the equation

$$\det\left(H - \lambda\left(E + \frac{n_1}{n_2}H\right)\right) = 0, \qquad (3)$$

which we also denote $\lambda_1 > \dots > \lambda_p$ to make the notation as
uniform across the different cases as possible. Note that as the number of
observations in the second sample, $n_2$, diverges to infinity while $n_1$ and p
are held constant, equation (3) reduces to equation (1), E converges to
$\Sigma$, and SigD reduces to PCA.
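The relation between the two equations can be checked numerically. The sketch below assumes that equation (3) reads $\det(H - \lambda(E + (n_1/n_2)H)) = 0$, which is the reading consistent with the limit $n_2 \to \infty$ and with the scaled multivariate beta remark in Section 3. Under that assumption, each solution $\lambda$ of (3) equals $f/(1 + (n_1/n_2)f)$ for a solution $f$ of (1). The simulated matrices are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n1, n2 = 4, 30, 50
X1 = rng.standard_normal((n1, p))
X2 = rng.standard_normal((n2, p))
H = X1.T @ X1 / n1
E = X2.T @ X2 / n2

# Solutions f of det(H - f E) = 0: eigenvalues of the F-ratio E^{-1} H.
f = np.sort(np.linalg.eigvals(np.linalg.solve(E, H)).real)[::-1]

# Solutions of det(H - lam (E + (n1/n2) H)) = 0 (assumed form of equation (3)).
lam = np.sort(
    np.linalg.eigvals(np.linalg.solve(E + (n1 / n2) * H, H)).real
)[::-1]

# The map f -> f / (1 + (n1/n2) f) is increasing, so the ordering is preserved.
lam_from_f = f / (1.0 + (n1 / n2) * f)
```

One consequence visible in the code is that the solutions of (3) are bounded above by $n_2/n_1$, whereas the F-ratio eigenvalues are unbounded.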

Cases 3 and 4 occur in multivariate regression

$$Y = X\beta + \varepsilon$$

when the goal is to test linear restrictions on the matrix of coefficients
$\beta$. Case 3 corresponds to the situation where the covariance matrix
$\Sigma$ of the i.i.d. Gaussian rows of the error matrix $\varepsilon$ is known.
We label this case as ‘regression with known variance’ (REG$_0$). Case 4
corresponds to the unknown $\Sigma$, and we label it as ‘regression with unknown
variance’ (REG).

As explained in Muirhead (1982), pp. 433-434, the problem of testing linear
restrictions on $\beta$ can be cast in the canonical form, where the matrix of
transformed response variables is split into three parts, $Y_1^*$, $Y_2^*$, and
$Y_3^*$. Matrix $Y_1^*$ is $n_1 \times p$, where p is the number of response
variables and $n_1$ is the number of restrictions. Under the null hypothesis,
$\mathbb{E}Y_1^* = 0$, whereas under the alternative,

$$\mathbb{E}Y_1^* = \sqrt{n_1\theta}\,\varphi\psi', \qquad (4)$$

where $\theta > 0$, $\|\Sigma^{-1/2}\psi\| = 1$, and $\|\varphi\| = 1$. Matrices
$Y_2^*$ and $Y_3^*$ are $(q - n_1) \times p$ and $(T - q) \times p$,
respectively, where q is the number of regressors and T is the number of
observations. These matrices have, respectively, unrestricted and zero means
under both the null and the alternative.

For REG$_0$, sufficiency and invariance arguments lead to tests based on the
solutions $\lambda_1, \dots, \lambda_p$ of (1) with

$$H = Y_1^{*\prime}Y_1^*/n_1 \quad\text{and}\quad E = \Sigma.$$

These solutions represent a multivariate analog of the difference between the sum of
squared residuals in the restricted and unrestricted regressions. For REG, similar
arguments lead to tests based on the p solutions $\lambda_1, \dots, \lambda_p$ of (3) with

$$H = Y_1^{*\prime}Y_1^*/n_1 \quad\text{and}\quad E = Y_3^{*\prime}Y_3^*/n_2,$$

where $n_2 = T - q$. These solutions represent a multivariate analog of the ratio of the
difference between the sum of squared residuals in the restricted and unrestricted
regressions to the sum of squared residuals in the restricted regression. Note that,
as $n_2 \to \infty$ while $n_1$ and p are held constant, REG reduces to REG$_0$.

Case 5 occurs in situations where the researcher would like to test for the
independence between Gaussian vectors $x_t \in \mathbb{R}^p$ and
$y_t \in \mathbb{R}^{n_1}$ given zero mean observations with
$t = 1, \dots, n_1 + n_2$. Partition the population and sample covariance
matrices of the observations as

$$\begin{pmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} S_{xx} & S_{xy} \\ S_{yx} & S_{yy} \end{pmatrix},$$

respectively. Under the null hypothesis, $\Sigma_{xy} = 0$. The alternative of interest is



where the vectors of nuisance parameters $\psi \in \mathbb{R}^p$ and
$\varphi \in \mathbb{R}^{n_1}$ are normalized so that



The peculiar form of the expression under the square root is chosen so as to simplify
various expressions in the analysis that follows.

The test can be based on the sample canonical correlations
$\lambda_1, \dots, \lambda_p$, which are solutions to (1) with

$$H = S_{xy}S_{yy}^{-1}S_{yx} \quad\text{and}\quad E = S_{xx}.$$

We label Case 5 as the ‘canonical correlation analysis’ (CCA) case. Remarkably,
the sample canonical correlations also solve (3) with different H and E, such that
E is a central Wishart matrix and H is a non-central Wishart matrix conditionally
on a random non-centrality parameter (for details, see Theorem 11.3.2 of Muirhead
(1982)).


Finally, as discussed in the Introduction, we also consider Case 0, which we
label as the ‘symmetric matrix denoising’ (SMD) case. Given a $p \times p$
matrix $X = \Phi + Z/\sqrt{p}$, where Z is a noise matrix from the Gaussian
Orthogonal Ensemble (GOE), a researcher would like to make inference about a
symmetric rank-one “signal” matrix $\Phi = \psi\theta\psi'$. Recall that a
symmetric matrix Z belongs to GOE if its diagonal and sub-diagonal entries are
independently distributed as

$$Z_{ii} \sim N(0, 2) \quad\text{and}\quad Z_{ij} \sim N(0, 1) \text{ if } i > j.$$

The null and the alternative hypotheses are given by (2). The nuisance
$\psi \in \mathbb{R}^p$ is normalized so that $\|\psi\| = 1$. The problem
remains invariant under the multiplication of X from the left by an orthogonal
matrix, and from the right by its transpose. A maximal invariant statistic
consists of the solutions $\lambda_1, \dots, \lambda_p$ to (1) with $H = X$ and
$E = I_p$. We consider tests based on $\lambda_1, \dots, \lambda_p$.
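The SMD observation can be simulated directly from the GOE definition just given. The following is a minimal sketch assuming NumPy; the helper `sample_goe`, the dimension, and the spike value are illustrative choices, not part of the paper.

```python
import numpy as np

def sample_goe(p, rng):
    """Symmetric matrix with N(0, 2) diagonal and N(0, 1) off-diagonal entries."""
    A = rng.standard_normal((p, p))
    return (A + A.T) / np.sqrt(2.0)

rng = np.random.default_rng(2)
p, theta = 400, 0.5                        # illustrative sub-critical spike
psi = rng.standard_normal(p)
psi /= np.linalg.norm(psi)                 # nuisance direction with unit norm
Phi = theta * np.outer(psi, psi)           # rank-one signal psi * theta * psi'
Z = sample_goe(p, rng)
X = Phi + Z / np.sqrt(p)
lam = np.sort(np.linalg.eigvalsh(X))[::-1] # maximal invariant: eigenvalues of X
```

With a spike this small, the extreme eigenvalues stay close to the semi-circle edges $\pm 2$, so the spike is not visible in the largest eigenvalue alone.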




The SMD case can be viewed as a degenerate version of all of the above cases.
For example, consider REG$_0$ with

$$\mathbb{E}Y_1^* = \sqrt{n_1\theta\sqrt{p/n_1}}\,\varphi\psi',$$

so that the original value of the spike $\theta$ (see equation (4)) is scaled by
$\sqrt{p/n_1}$. Suppose now that $n_1$ diverges to infinity while p is held
constant. Then, by a Central Limit Theorem (CLT),

$$H - I_p = Z/\sqrt{n_1} + \sqrt{p/n_1}\,\psi\theta\psi' + o_P\left(n_1^{-1/2}\right), \qquad (5)$$

where Z belongs to GOE. On the other hand, equation (1) is equivalent to

$$\det\left(H - I_p - (\lambda - 1)I_p\right) = 0. \qquad (6)$$

Multiplying it by $\sqrt{n_1/p}$ and using (5), we see that equation (6)
degenerates to

$$\det\left(Z/\sqrt{p} + \psi\theta\psi' - \mu I_p\right) = 0 \quad\text{with}\quad \mu = \sqrt{n_1/p}\,(\lambda - 1).$$

Hence, REG$_0$ degenerates to SMD.

For the reader’s convenience, we summarize links between the different cases
and the definitions of the corresponding matrices H and E in Figure 1. We denote
the p-dimensional Wishart distribution with n degrees of freedom, covariance
parameter $\Sigma$, and non-centrality parameter $\Psi$ as
$W_p(n, \Sigma, \Psi)$. Recall that, if $H = R'R$, where the $n \times p$ matrix
R is $N(M, I_n \otimes \Sigma)$, then $H \sim W_p(n, \Sigma, \Psi)$ with the
non-centrality $\Psi = \Sigma^{-1}M'M$. Notation $W_p(n, \Sigma)$ is used for
the central Wishart distribution. Without loss of generality, we assume that
$\Sigma = I_p$.

All the cases eventually degenerate to SMD via sequential asymptotic links.
Cases SMD, PCA, and REG$_0$, forming the upper half of the diagram, correspond
to random H and deterministic E. The cases in the lower half of the diagram
correspond to both H and E being random. Cases PCA and SigD are “parallel” to
cases REG$_0$ and REG in the sense that the alternative hypothesis is characterized
by a rank one perturbation of the covariance and of the non-centrality parameter
of H for the former and for the latter two cases, respectively. Case CCA “stands
alone” because of the different structure of H and E. As discussed above, CCA can
be reinterpreted in terms of H and E such that E is Wishart, but H is a non-central
Wishart only after conditioning on a random non-centrality parameter.
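[Figure 1 (diagram). Boxes:
  SMD:      $H = \mathrm{GOE}/\sqrt{p} + \Phi$,          $E = I_p$
  PCA:      $n_1H = W_p(n_1, I_p + \Phi)$,               $E = I_p$
  REG$_0$:  $n_1H = W_p(n_1, I_p, n_1\Phi)$,             $E = I_p$
  SigD:     $n_1H = W_p(n_1, I_p + \Phi)$,               $n_2E = W_p(n_2, I_p)$
  REG:      $n_1H = W_p(n_1, I_p, n_1\Phi)$,             $n_2E = W_p(n_2, I_p)$
  CCA:      $H = S_{xy}S_{yy}^{-1}S_{yx}$,               $E = S_{xx}$
Arrows labeled $n_1 \to \infty$, with the spike rescaled as
$\theta \mapsto \sqrt{p/n_1}\,\theta$, lead from PCA and from REG$_0$ to SMD.
Arrows labeled $n_2 \to \infty$ lead from SigD to PCA and from REG to REG$_0$.]

Figure 1: Matrices H and E, and links between the different cases. Matrix $\Phi$ has
the form $\psi\theta\psi'$ with $\theta > 0$ and $\|\psi\| = 1$.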
Case     ${}_pF_q$   $\alpha(\theta)$            a            b         $\Psi_{11}$
SMD      ${}_0F_0$   $\exp(-p\theta^2/4)$        -            -         $\theta p/2$
PCA      ${}_0F_0$   $(1+\theta)^{-n_1/2}$       -            -         $\theta n_1/(2(1+\theta))$
SigD     ${}_1F_0$   $(1+\theta)^{-n_1/2}$       $n/2$        -         $\theta n_1/(n_2(1+\theta))$
REG$_0$  ${}_0F_1$   $\exp(-n_1\theta/2)$        -            $n_1/2$   $\theta n_1^2/4$
REG      ${}_1F_1$   $\exp(-n_1\theta/2)$        $n/2$        $n_1/2$   $\theta n_1^2/(2n_2)$
CCA      ${}_2F_1$   $(1+n_1\theta/n)^{-n/2}$    $(n/2, n/2)$ $n_1/2$   $\theta n_1^2/(n_2^2 + n_2n_1(1+\theta))$

Table 1: Parameters of the explicit expression (7) for the likelihood ratios. Here
$n = n_1 + n_2$.


3 The likelihood ratios 


Our goal is to study the asymptotic behavior of the likelihood ratios, which are
defined as the ratios of the joint density of $\lambda_1, \dots, \lambda_p$
under the alternative to that under the null hypothesis, where both densities
are evaluated at the observed values of the $\lambda$’s. Let

$$\Lambda = \mathrm{diag}\{\lambda_1, \dots, \lambda_p\},$$

and let us denote the likelihood ratio corresponding to particular case ‘Case’ =
‘SMD’, ‘PCA’, etc. as $L^{(Case)}(\theta; \Lambda)$. Then

$$L^{(Case)}(\theta; \Lambda) = \alpha(\theta)\,{}_pF_q(a, b; \Psi, \Lambda), \qquad (7)$$

where $\Psi$ is a p-dimensional matrix $\mathrm{diag}\{\Psi_{11}, 0, \dots, 0\}$,
and the values of $\Psi_{11}$, $\alpha(\theta)$, p, q, a, and b are as given in
Table 1.

We prove that $L^{(SMD)}(\theta; \Lambda)$ is as in (7) in the Appendix. For
PCA, the explicit form of the likelihood ratio is derived in Onatski et al
(2013). For SigD, REG$_0$, and REG, the expressions (7) with the parameters
given in Table 1 follow, respectively, from equations (65), (68), and (73) of
James (1964). For CCA, the expression is a corollary of Theorem 11.3.2 of
Muirhead (1982).

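Recall that hypergeometric functions of two matrix arguments $\Psi$ and
$\Lambda$ are defined as

$${}_pF_q(a, b; \Psi, \Lambda) = \sum_{k=0}^{\infty}\frac{1}{k!}\sum_{\kappa \vdash k}
\frac{(a_1)_\kappa \cdots (a_p)_\kappa}{(b_1)_\kappa \cdots (b_q)_\kappa}\,
\frac{C_\kappa(\Psi)\,C_\kappa(\Lambda)}{C_\kappa(I_p)},$$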
where $a = (a_1, \dots, a_p)$ and $b = (b_1, \dots, b_q)$ are parameters,
$\kappa$ are partitions of the integer k, $(a_j)_\kappa$ and $(b_j)_\kappa$ are
the generalized Pochhammer symbols, and $C_\kappa$ are the zonal polynomials
(see Muirhead (1982), Definition 7.3.2). As mentioned in the Introduction,
James’ (1964) classification of the multivariate statistical problems is based
on the type of ${}_pF_q$ that occur in related probability distributions. The
function ${}_0F_0$ of exponential type corresponds to the first class
represented by PCA; the function ${}_1F_0$ of binomial type corresponds to the
second class represented by SigD; the function ${}_0F_1$ of Bessel type is
associated with the third class represented by REG$_0$; the confluent
hypergeometric function ${}_1F_1$ is associated with the fourth class
represented by REG; and the Gaussian hypergeometric function ${}_2F_1$
corresponds to the fifth class represented by CCA. Note that some links between
the cases illustrated in Figure 1 can also be established via asymptotic
relations between the hypergeometric functions in the different rows of Table 1.
For example, the links REG $\to$ REG$_0$ and SigD $\to$ PCA as $n_2 \to \infty$
while p and $n_1$ are held constant follow from the confluence relations (see,
for example, chapter 3.5 of Luke (1969))


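$${}_0F_1(b; \Psi, \Lambda) = \lim_{a \to \infty}{}_1F_1(a, b; a^{-1}\Psi, \Lambda) \quad\text{and}$$
$${}_0F_0(\Psi, \Lambda) = \lim_{a \to \infty}{}_1F_0(a; a^{-1}\Psi, \Lambda).$$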
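For the rank-one $\Psi = \mathrm{diag}\{\Psi_{11}, 0, \dots, 0\}$ used throughout, the zonal polynomial series can be evaluated with elementary means. The sketch below relies on a standard property of zonal polynomials that the text does not spell out: only one-part partitions contribute, so the series reduces to $\sum_k \frac{(a)_k}{(b)_k (p/2)_k}\Psi_{11}^k h_k(\Lambda)$, where $h_k$ are the power series coefficients of $\prod_j (1 - t\lambda_j)^{-1/2}$. The function names and the truncation level `K` are hypothetical.

```python
import math

def h_coeffs(lams, K):
    """Coefficients h_0..h_K of prod_j (1 - t*lam_j)^(-1/2) as a series in t."""
    power_sums = [sum(l ** m for l in lams) for m in range(K + 1)]
    h = [1.0]
    for k in range(1, K + 1):
        # Recurrence from d/dt log P(t) = (1/2) * sum_m power_sums[m+1] * t^m.
        h.append(sum(power_sums[m] * h[k - m] for m in range(1, k + 1)) / (2.0 * k))
    return h

def pFq_rank_one(a, b, psi11, lams, K=80):
    """Truncated series for pFq(a, b; diag(psi11, 0, ..., 0), diag(lams))."""
    p = len(lams)
    h = h_coeffs(lams, K)
    coef, total = 1.0, h[0]
    for k in range(1, K + 1):
        j = k - 1
        ratio = psi11 / (p / 2.0 + j)      # builds psi11^k / (p/2)_k step by step
        for ai in a:
            ratio *= ai + j                # numerator Pochhammer symbols
        for bi in b:
            ratio /= bi + j                # denominator Pochhammer symbols
        coef *= ratio
        total += coef * h[k]
    return total

lams = [0.5, 1.0, 1.5]
big_a = 1e4
confluent = pFq_rank_one((big_a,), (), 0.8 / big_a, lams)   # 1F0(a; Psi/a, Lambda)
limit = pFq_rank_one((), (), 0.8, lams)                     # 0F0(Psi, Lambda)
```

Two sanity checks are available in closed form: with $\Lambda = I_p$ the function ${}_0F_0$ equals $\exp(\Psi_{11})$, and `confluent` approaches `limit` as `big_a` grows, in line with the second confluence relation.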

In the next section, we shall study the asymptotic behavior of the likelihood
ratios (7) as $n_1$, $n_2$, and p go to infinity so that

$$c_1 \equiv \frac{p}{n_1} \to \gamma_1 \in (0, 1) \quad\text{and}\quad c_2 \equiv \frac{p}{n_2} \to \gamma_2 \in (0, 1]. \qquad (8)$$

We denote this asymptotic regime as $n, p \to_\gamma \infty$, where
$n = \{n_1, n_2\}$ and $\gamma = \{\gamma_1, \gamma_2\}$. To make our exposition
as uniform as possible, we use this notation for all the cases, even though the
simpler ones, such as SMD, do not refer to n. In the Conclusion, we briefly
discuss possible extensions of our analysis to the situations with
$\gamma_1 > 1$.

We are interested in the asymptotics of the likelihood ratios under the null
hypothesis, that is, when the true value of the spike, $\theta_0$, equals zero.
Before turning to the next section, let us provide relevant background on the
asymptotics of $\Lambda$. Under the null, $\lambda_1, \dots, \lambda_p$ are the
eigenvalues of $\mathrm{GOE}/\sqrt{p}$ in SMD case; of $W_p(n_1, I_p)/n_1$ in
PCA and REG$_0$ cases; and of a scaled (by a factor of $n_2/n_1$) p-dimensional
multivariate beta matrix with parameters $n_1/2$ and $n_2/2$ in SigD,
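Case             $F_\gamma^{\lim}$   Density for $\beta_- < \lambda < \beta_+$                                                                                      $\beta_\pm$                                                        Threshold $\bar\theta$
SMD              SC                  $\frac{1}{2\pi}\sqrt{(\beta_+ - \lambda)(\lambda - \beta_-)}$                                                                 $\pm 2$                                                            $1$
PCA, REG$_0$     MP                  $\frac{1}{2\pi\gamma_1\lambda}\sqrt{(\beta_+ - \lambda)(\lambda - \beta_-)}$                                                  $(1 \pm \sqrt{\gamma_1})^2$                                        $\sqrt{\gamma_1}$
SigD, REG, CCA   W                   $\frac{\gamma_1 + \gamma_2}{2\pi\gamma_1\lambda(\gamma_1 - \gamma_2\lambda)}\sqrt{(\beta_+ - \lambda)(\lambda - \beta_-)}$    $\gamma_1\left(\frac{\rho \pm 1}{\rho \pm \gamma_2}\right)^2$      $\frac{\gamma_2 + \rho}{1 - \gamma_2}$

Table 2: The semi-circle, Marchenko-Pastur, and (scaled) Wachter distributions.
Here $\rho = \sqrt{\gamma_1 + \gamma_2 - \gamma_1\gamma_2}$. In the case where
$\gamma_1 > 1$, which is not considered in this paper, the Marchenko-Pastur and
Wachter distributions will also have mass $(\gamma_1 - 1)/\gamma_1$ at zero.
Column ‘Threshold $\bar\theta$’ reports the values of the phase transition
thresholds.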

REG, and CCA cases. For a definition of the multivariate beta, see Muirhead
(1982), p. 110.

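Let

$$\hat F^{(Case)}(\lambda) = \frac{1}{p}\sum_{j=1}^{p}\mathbf{1}\{\lambda_j \le \lambda\}$$

be the empirical distribution of $\lambda_1, \dots, \lambda_p$. As is well known
(see Bai (1999)), as $n, p \to_\gamma \infty$, $\hat F^{(Case)}$ almost surely
(a.s.) weakly converges,

$$\hat F^{(Case)} \Rightarrow F_\gamma^{\lim},$$

where $F_\gamma^{\lim}$ is the semi-circle distribution $F^{SC}$ in SMD case;
the Marchenko-Pastur distribution $F_\gamma^{MP}$ in PCA and REG$_0$ cases; and
the (scaled) Wachter distribution $F_\gamma^{W}$ in SigD, REG, and CCA cases.
Table 2 reports the explicit forms of these limiting distributions. Note that
the cumulative distribution functions $F_\gamma^{\lim}(\lambda)$ are linked in
the sense that $F_\gamma^{W}(\lambda) \to F_\gamma^{MP}(\lambda)$ when
$\gamma_2 \to 0$ and $F_\gamma^{MP}(\sqrt{\gamma_1}\lambda + 1) \to F^{SC}(\lambda)$
when $\gamma_1 \to 0$.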
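The three limiting densities can be checked for internal consistency by integrating them numerically. The sketch below encodes the semi-circle and Marchenko-Pastur densities in their standard forms; the Wachter density and its support edges are written as we reconstruct them from the scaled F-ratio law, so they should be read as an assumption rather than as a quotation of Table 2. The function names are hypothetical, and the Wachter branch assumes $\gamma_2 < 1$.

```python
import math

def edges(case, g1=None, g2=None):
    """Support edges (beta_minus, beta_plus) of the limiting distribution."""
    if case == "SC":
        return -2.0, 2.0
    if case == "MP":
        return (1 - math.sqrt(g1)) ** 2, (1 + math.sqrt(g1)) ** 2
    rho = math.sqrt(g1 + g2 - g1 * g2)     # Wachter case, assumes g2 < 1
    return (g1 * ((rho - 1) / (rho - g2)) ** 2,
            g1 * ((rho + 1) / (rho + g2)) ** 2)

def density(case, lam, lo, hi, g1=None, g2=None):
    """Density at lam, for lo < lam < hi."""
    root = math.sqrt((hi - lam) * (lam - lo))
    if case == "SC":
        return root / (2 * math.pi)
    if case == "MP":
        return root / (2 * math.pi * g1 * lam)
    return (g1 + g2) * root / (2 * math.pi * g1 * lam * (g1 - g2 * lam))

def total_mass(case, g1=None, g2=None, m=100_000):
    """Midpoint-rule approximation to the integral of the density."""
    lo, hi = edges(case, g1, g2)
    step = (hi - lo) / m
    return step * sum(
        density(case, lo + (i + 0.5) * step, lo, hi, g1, g2) for i in range(m)
    )
```

For $\gamma_1 < 1$ each density should integrate to one, since the point mass at zero only appears when $\gamma_1 > 1$; for example, `total_mass("W", 0.5, 0.3)` can be compared with 1.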

For what follows it will be important that the centered linear spectral statistics

$$\sum_{j=1}^{p}\varphi(\lambda_j) - p\int\varphi(\lambda)\,dF_c^{\lim}(\lambda), \qquad (9)$$

where $\varphi$ is a ‘well-behaved’ function, converge in distribution to
Gaussian random variables. The corresponding CLTs are established in Bai and Yao
(2005), Bai and Silverstein (2004), and Zheng (2012) for the cases of the
semi-circle, Marchenko-Pastur, and Wachter limiting distributions, respectively.
Note that the centering constant is defined in terms of $F_c^{\lim}$ where
$c = \{c_1, c_2\}$. That is, the “correct centering” can be computed using the
densities from Table 2, where $\gamma_1$ and $\gamma_2$ are replaced by
$c_1 = p/n_1$ and $c_2 = p/n_2$, respectively.
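To make the centering in (9) concrete, the sketch below computes a centered linear spectral statistic in the Marchenko-Pastur case, with the centering evaluated at the finite-sample ratio $c_1 = p/n_1$ as the text prescribes. It assumes NumPy and $c_1 < 1$; the function names, the test function, and the simulated data are illustrative.

```python
import numpy as np

def mp_integral(phi, c1, m=200_000):
    """Midpoint-rule integral of phi against the Marchenko-Pastur law, c1 < 1."""
    lo, hi = (1 - np.sqrt(c1)) ** 2, (1 + np.sqrt(c1)) ** 2
    x = lo + (np.arange(m) + 0.5) * (hi - lo) / m
    dens = np.sqrt((hi - x) * (x - lo)) / (2 * np.pi * c1 * x)
    return np.sum(phi(x) * dens) * (hi - lo) / m

def centered_lss(eigs, phi, c1):
    """sum_j phi(lambda_j) - p * integral of phi dF_c, as in (9)."""
    return np.sum(phi(eigs)) - len(eigs) * mp_integral(phi, c1)

# Null data for the PCA case: eigenvalues of W_p(n1, I_p) / n1.
rng = np.random.default_rng(3)
p, n1 = 100, 200
X = rng.standard_normal((n1, p))
eigs = np.linalg.eigvalsh(X.T @ X / n1)
stat = centered_lss(eigs, lambda x: x, p / n1)
```

For the identity test function the Marchenko-Pastur mean is 1, so `stat` reduces to the trace of H minus p and stays of order one as p and $n_1$ grow together. Replacing the test function by $\ln(z - \lambda)$ gives the statistic that appears in the exponent of $g_{II}(z)$ in Section 4.1.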

Finally, let us note the behavior of the largest eigenvalue $\lambda_1$ under
the alternative hypothesis. As is well known, $\lambda_1$ a.s. converges to the
upper boundary of the support of $F_\gamma^{\lim}$ as long as $\theta$ remains
below the phase transition threshold $\bar\theta$. The value of the threshold is
reported in the last column of Table 2. When $\theta > \bar\theta$, $\lambda_1$
separates from ‘the bulk’ of the other eigenvalues and a.s. converges to a point
strictly above the upper boundary of the support of $F_\gamma^{\lim}$. For
details, we refer the reader to Maïda (2007), Baik and Silverstein (2006),
Nadakuditi and Silverstein (2010), Onatski (2007), Dharmawansa et al (2014a),
and Bao et al (2014) for cases SMD, PCA, SigD, REG$_0$, REG, and CCA,
respectively.
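The phase transition is easy to see in a small simulation of the PCA case. The sketch below assumes NumPy, uses the threshold $\sqrt{c_1}$ and the bulk edge $(1 + \sqrt{c_1})^2$ with $c_1 = p/n_1$ in place of $\gamma_1$, and places the spike along the first coordinate, which is harmless because the problem is rotation invariant. The function name and the parameter values are illustrative.

```python
import numpy as np

def top_eigenvalue(theta, p, n1, rng):
    """Largest eigenvalue of the sample covariance of n1 draws from
    N(0, I_p + theta * e1 e1'), where e1 is the first coordinate vector."""
    X = rng.standard_normal((n1, p))
    X[:, 0] *= np.sqrt(1.0 + theta)        # spike along the first coordinate
    return np.linalg.eigvalsh(X.T @ X / n1)[-1]

rng = np.random.default_rng(4)
p, n1 = 200, 400
c1 = p / n1
threshold = np.sqrt(c1)                    # PCA phase transition threshold
bulk_edge = (1 + np.sqrt(c1)) ** 2         # upper edge of the Marchenko-Pastur bulk

sub = top_eigenvalue(0.5 * threshold, p, n1, rng)   # sub-critical spike
sup = top_eigenvalue(5.0, p, n1, rng)               # super-critical spike
```

In this experiment `sub` stays close to `bulk_edge`, while `sup` lands well above it, which is the separation from the bulk described above.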

The fact that $\lambda_1$ converges to different limits under the null and under
the alternative hypothesis sheds light on the behavior of the likelihood ratio
when $\theta$ is above the phase transition threshold. In such cases, which can
be called the cases of super-critical $\theta$, the likelihood ratio
degenerates. The sequences of measures corresponding to the distributions of
$\Lambda$ under the null and under super-critical alternatives are
asymptotically mutually singular as $n, p \to_\gamma \infty$ (see Montanari et
al (2014) and Onatski et al (2013) for a detailed analysis of SMD and PCA
cases). In contrast, as we shall show below, the sequences of measures
corresponding to the distributions of $\Lambda$ under the null and under
sub-critical alternatives ($\theta$ is below the threshold) are mutually
contiguous, and the likelihood ratio converges to a Gaussian process.

4 Contour integral representation 

Asymptotic behavior of the likelihood ratios (7) depends on that of
${}_pF_q(a, b; \Psi, \Lambda)$. There is a large and well established literature
on the asymptotics of ${}_pF_q(a, b; \Psi, \Lambda)$ when the parameters and the
norm of the matrix arguments grow while the dimensionality of the latter remains
fixed (see Muirhead (1978) for a review). In contrast, relatively little is
known about the asymptotic regime that allows the dimensionality of the matrix
arguments $\Psi$, $\Lambda$ to diverge to infinity. In this paper, we
investigate such an asymptotic regime. We exploit the fact that, since we study
single-spiked models, the matrix argument $\Psi$ has rank one. This allows us
to represent ${}_pF_q(a, b; \Psi, \Lambda)$ in the form of a contour integral of
a hypergeometric function with a single scalar argument. Such a representation
implies contour integral representations for the corresponding likelihood
ratios, which we summarize in the following lemma. The results of the lemma are
used below to derive the asymptotics of the likelihood ratios via the Laplace
approximation.

In what follows, we omit the superscripts ‘(Case)’ and ‘lim’ for quantities such
as $L^{(Case)}(\theta; \Lambda)$, $\hat F^{(Case)}(\lambda)$, and
$F_\gamma^{\lim}(\lambda)$ to simplify our notation. However, we shall use these
superscripts to identify particular instances, when necessary.

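Lemma 1 Assume that $p < \min\{n_1, n_2\}$. Let $\mathcal{K}$ be a contour in
the complex plane $\mathbb{C}$ that starts at $-\infty$, encircles 0 and
$\lambda_1, \dots, \lambda_p$ counterclockwise, and returns to $-\infty$. Then

$$L(\theta; \Lambda) = \frac{\Gamma(s+1)\,\alpha(\theta)\,q_s}{\Psi_{11}^{s}\,2\pi i}
\int_{\mathcal{K}} {}_pF_q(a - s, b - s; \Psi_{11}z)\prod_{j=1}^{p}(z - \lambda_j)^{-1/2}\,dz, \qquad (10)$$

where $s = p/2 - 1$, the values of $\alpha(\theta)$, $\Psi_{11}$, a, b, p, and q
for the different cases are given in Table 1; $a - s$ and $b - s$ denote vectors
with elements $a_j - s$ and $b_j - s$, respectively; the hypergeometric function
under the integral is the standard hypergeometric function of a scalar argument;
and

$$q_s = \prod_{j=1}^{p}\frac{\Gamma(a_j - s)}{\Gamma(a_j)}\prod_{i=1}^{q}\frac{\Gamma(b_i)}{\Gamma(b_i - s)}.$$

In cases SigD and CCA, we require, in addition, that the contour $\mathcal{K}$
does not intersect $[\Psi_{11}^{-1}, \infty)$, which ensures the analyticity of
the integrand in an open subset of $\mathbb{C}$ that includes $\mathcal{K}$.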

The statement of the lemma immediately follows from Proposition 1 of
Dharmawansa and Johnstone (2014) and from equation (7). Our next step is to
apply the Laplace approximation to integrals (10). To this end, we shall
transform the right hand side of (10) so that it has a “Laplace form”

$$L(\theta; \Lambda) = \frac{\sqrt{\pi p}}{2\pi i}\int_{\mathcal{K}}\exp\{-pf(z)\}\,g(z)\,dz. \qquad (11)$$

Leaving $\sqrt{\pi p}/(2\pi i)$ separate from $g(z)$ allows us to choose $f(z)$
and $g(z)$ that are bounded in probability, and makes some of the expressions
below more compact.
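The contour representation can be verified numerically in the simplest setting. The sketch below treats ${}_0F_0$ (SMD and PCA), for which $q_s = 1$ and the scalar function under the integral is $\exp(\Psi_{11}z)$. It makes two assumptions that go beyond the lemma as stated: the prefactor is taken to be $\Gamma(s+1)/(\Psi_{11}^s 2\pi i)$, and, for even p, the contour from $-\infty$ is replaced by a closed circle enclosing all $\lambda_j$, which is legitimate here because the product of an even number of inverse square roots is single-valued on such a circle and the exponential is entire. The series used for comparison is the rank-one reduction of the zonal polynomial expansion. All function names are hypothetical.

```python
import cmath
import math

def h_coeffs(lams, K):
    """Coefficients h_0..h_K of prod_j (1 - t*lam_j)^(-1/2) as a series in t."""
    power_sums = [sum(l ** m for l in lams) for m in range(K + 1)]
    h = [1.0]
    for k in range(1, K + 1):
        h.append(sum(power_sums[m] * h[k - m] for m in range(1, k + 1)) / (2.0 * k))
    return h

def zero_F_zero_series(psi11, lams, K=80):
    """Rank-one 0F0(Psi, Lambda) = sum_k psi11^k h_k / (p/2)_k, truncated at K."""
    p = len(lams)
    h = h_coeffs(lams, K)
    coef, total = 1.0, h[0]
    for k in range(1, K + 1):
        coef *= psi11 / (p / 2.0 + k - 1)
        total += coef * h[k]
    return total

def zero_F_zero_contour(psi11, lams, radius, m=4000):
    """Same quantity from the contour integral, on a circle of given radius."""
    p = len(lams)
    if p % 2 != 0 or radius <= max(abs(l) for l in lams):
        raise ValueError("need even p and a circle enclosing all eigenvalues")
    s = p // 2 - 1
    acc = 0j
    for i in range(m):
        z = radius * cmath.exp(2j * math.pi * (i + 0.5) / m)
        integrand = cmath.exp(psi11 * z)
        for l in lams:
            integrand /= cmath.sqrt(z - l)   # principal branch; sign flips cancel for even p
        acc += integrand * z                 # dz = i z dt; the i cancels with 1/(2 pi i)
    integral = acc / m                       # approximates (1/(2 pi i)) * contour integral
    return (math.gamma(s + 1) / psi11 ** s * integral).real

lams = [0.6, 1.0, 1.4, 2.0]
by_contour = zero_F_zero_contour(1.3, lams, radius=4.0)
by_series = zero_F_zero_series(1.3, lams)
```

The two evaluations agree to high accuracy, and for $\Lambda = I_4$ both reduce to $\exp(\Psi_{11})$. Multiplying by $\alpha(\theta)$ from Table 1 would turn either value into the likelihood ratio (7).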







Case 

Vi 

P//(l + 0(1)) 

SMD 

l + 9'^/2 + \n 9 

9 

PCA 

1+i^ln(l + 0) + ln|- 

9 of\\A 9 y^ 

SigD 

2fiPCA) 

■' J Cl C1C2 C1+C2 

9 oy (1 -h 9 )~^ r (ci -1- €2)^^^ 

REGo 

l + ^ + lnA + l^ln(l-ci) 

9 oy (1 - 

REG 

2AREG0) l + lnCi+C 2 r -2 

•’I Cl C 1 C 2 C 1 +C 2 

9 oy (1 - r (ci + 02f^‘^ 

GGA 

9 r{REG) I 0 A A 

^JI Cl C1C2 cil 

9 oy {1 - (C1 + C2)/- 


Table 3: Values of $2f_I$ and $g_I/(1 + o(1))$ for the different cases. The
terms $o(1)$ do not depend on $\theta$ and converge to zero as
$n, p \to_\gamma \infty$. The term r is defined as $r^2 = c_1 + c_2 - c_1c_2$.
The term $l = l(\theta)$ is defined as $l(\theta) = 1 + (1 + \theta)c_2/c_1$.


In order to apply the Laplace approximation, we shall deform the contour of
integration so that it passes through a critical point $z_0$ of $f(z)$ and is
such that $\mathrm{Re}\,f(z)$ is strictly increasing as z moves away from $z_0$
along the contour, at least in a vicinity of $z_0$.


4.1 The Laplace form 

We shall transform (10) to (11) in three steps. As a result, functions $f$ and $g$ will
have the forms of a sum and a product:

$$f(z) = f_I + f_{II}(z) + f_{III}(z) \quad\text{and}\quad g(z) = g_I \times g_{II}(z) \times g_{III}(z),$$

where $f_I$ and $g_I$ do not depend on $z$.

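First, using the definitions of $\alpha(\theta)$, $q_s$, $\Psi_n$ and employing Stirling’s approximation,
we obtain a decomposition

$$\frac{\Gamma(s+1)\,\alpha(\theta)\,q_s}{\Psi_n^{s}\sqrt{\pi p}} = \exp\{-pf_I\}\,g_I, \tag{12}$$

where $g_I$ remains bounded as $n,p \to \infty$. The values of $2f_I$ and $g_I$ are given in
Table 3. It should be noted that $f_I^{(SigD)} \to f_I^{(PCA)}$, and $f_I^{(REG)}$ and $f_I^{(CCA)}$ converge to $f_I^{(REG_0)}$,
as $c_2 \to 0$.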
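Next, we consider the decomposition

$$\prod_{j=1}^{p}(z-\lambda_j)^{-1/2} = \exp\{-pf_{II}(z)\}\,g_{II}(z), \tag{13}$$

where

$$2f_{II}(z) = \int \ln(z-\lambda)\,dF_{\mathbf{c}}(\lambda), \tag{14}$$

and

$$g_{II}(z) = \exp\left\{-\frac{p}{2}\int \ln(z-\lambda)\,d\bigl(\hat F(\lambda) - F_{\mathbf{c}}(\lambda)\bigr)\right\}. \tag{15}$$

Here $\hat F$ denotes the empirical distribution of $\lambda_1,\dots,\lambda_p$. For $f_{II}(z)$ and $g_{II}(z)$ to be
well-defined we need $z$ not to belong to the support of $F_{\mathbf{c}}$, which we assume. Note that
$g_{II}(z)$ is the exponential of a linear spectral statistic, which converges to a Gaussian random
variable as $n,p \to \infty$ under the null hypothesis. Since $F_{\mathbf{c}}(\lambda) \to F_{c_1}(\lambda)$ as $c_2 \to 0$, we have
$f_{II}^{(SigD)}(z) = f_{II}^{(REG)}(z) = f_{II}^{(CCA)}(z)$ converging to $f_{II}^{(PCA)}(z) = f_{II}^{(REG_0)}(z)$.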

Finally, we obtain a decomposition

$${}_pF_q(a-s,\,b-s;\,\Psi_n z) = \exp\{-pf_{III}(z)\}\,g_{III}(z). \tag{16}$$

For SMD, PCA, and SigD, the corresponding ${}_pF_q$ can be expressed in terms of
elementary functions, and we set

$$2f_{III}(z) = \begin{cases} -z\theta & \text{for SMD} \\ -z\theta/(c_1(1+\theta)) & \text{for PCA} \\ \ln\left[1 - c_2z\theta/\{c_1(1+\theta)\}\right] r^2/(c_1c_2) & \text{for SigD} \end{cases}, \tag{17}$$

and

$$g_{III}(z) = \begin{cases} 1 & \text{for SMD and PCA} \\ \left[1 - c_2z\theta/\{c_1(1+\theta)\}\right]^{-1} & \text{for SigD} \end{cases}. \tag{18}$$

As $c_2 \to 0$, $f_{III}^{(SigD)}(z)$ converges to $f_{III}^{(PCA)}(z)$. Since, as has been shown above, a
similar convergence holds for $f_I$ and $f_{II}$, we have $f^{(SigD)}(z) \to f^{(PCA)}(z)$ as $c_2 \to 0$.
Combining (14) and (17) with the information supplied by Table 3, we also see
that $f^{(PCA)}(z) \to f^{(SMD)}(z)$ as $c_1 \to 0$ after the transformations $\theta \mapsto \sqrt{c_1}\,\theta$ and
$z \mapsto \sqrt{c_1}\,z + 1$.

Unfortunately, for REG$_0$, REG, and CCA, the corresponding ${}_pF_q$ do not admit
exact representations in terms of elementary functions. Therefore, we shall consider
their asymptotic approximations instead. Let

$$m = (n_1 - p)/2 \quad\text{and}\quad \varepsilon = (n - p)/(n_1 - p).$$

Footnote: By definition, contour $\mathcal{K}$ encircles the support of $\hat F$, and hence $z \in \mathcal{K}$ does not belong to such
support.

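Further, let

$$\eta_j = \begin{cases} z\theta/(1-c_1)^2 & \text{for } j = 0 \\ z\theta c_2/[c_1(1-c_1)] & \text{for } j = 1 \\ z\theta c_2^2/[c_1^2\,l(\theta)] & \text{for } j = 2 \end{cases}, \tag{19}$$

where

$$l(\theta) = 1 + (1+\theta)\,c_2/c_1. \tag{20}$$

With this notation, we have

$${}_pF_q = \begin{cases} {}_0F_1(m+1;\, m^2\eta_0) \equiv F_0 & \text{for REG}_0 \\ {}_1F_1(m\varepsilon+1;\, m+1;\, m\eta_1) \equiv F_1 & \text{for REG} \\ {}_2F_1(m\varepsilon+1,\, m\varepsilon+1;\, m+1;\, \eta_2) \equiv F_2 & \text{for CCA} \end{cases}. \tag{21}$$

The function $F_0$ can be expressed in terms of the modified Bessel function of
the first kind $I_m(\cdot)$ as (see Abramowitz and Stegun (1964), equation 9.6.47)

$$F_0 = \Gamma(m+1)\,(m^2\eta_0)^{-m/2}\, I_m\!\left(2m\eta_0^{1/2}\right). \tag{22}$$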

This representation allows us to use a known uniform asymptotic approximation of
the Bessel function (see Abramowitz and Stegun (1964), equation 9.7.7) to obtain
the following lemma. Let

$$\varphi_0(t) = \ln t - t - \eta_0/t + 1 \quad\text{and}\quad t_0 = \left(1 + \sqrt{1 + 4\eta_0}\right)/2. \tag{23}$$

Further, for any $\delta > 0$, let $\Omega_{0\delta}$ be the set of $\eta_0 \in \mathbb{C}$ such that

$$|\arg \eta_0| < \pi - \delta, \quad\text{and}\quad \eta_0 \neq 0.$$

Lemma 2 As $m \to \infty$, we have

$$F_0 = (1 + 4\eta_0)^{-1/4}\exp\{-m\varphi_0(t_0)\}\,(1 + o(1)). \tag{24}$$

The convergence $o(1) \to 0$ holds uniformly with respect to $\eta_0 \in \Omega_{0\delta}$ for any $\delta > 0$.
We would like to point out that the right hand side of (24) can be formally
linked, via (22), to the saddle-point approximation of the integral representation
(see Watson (1944), p. 181)

$$I_m\!\left(2m\eta_0^{1/2}\right) = \frac{e^{m}\,\eta_0^{m/2}}{2\pi i}\int_{-\infty}^{(0+)} \exp\{-m\varphi_0(t)\}\,t^{-1}\,dt.$$

Point $t_0$ can be interpreted as a saddle point of $\varphi_0(t)$, and the term $(1 + 4\eta_0)^{-1/4}$
in (24) can be interpreted as a factor of $(\varphi_0''(t_0))^{-1/2}$.
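The approximation in Lemma 2 can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates ${}_0F_1(m+1; m^2\eta_0)$ by summing its power series in logarithms and compares it with the right hand side of (24). The helper names `log_0f1`, `log_lemma2`, and `ratio`, and the choice $\eta_0 = 1$, are illustrative assumptions.

```python
import math

def log_0f1(b, x, terms=3000):
    # log of the series sum_k x^k / ((b)_k k!) for x > 0, summed via log-sum-exp
    logs = [k * math.log(x)
            - (math.lgamma(b + k) - math.lgamma(b))
            - math.lgamma(k + 1)
            for k in range(terms)]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))

def log_lemma2(m, eta0):
    # log of (1 + 4 eta0)^(-1/4) * exp(-m * phi0(t0)), with phi0 and t0 as in (23)
    root = math.sqrt(1 + 4 * eta0)
    t0 = (1 + root) / 2
    phi0 = math.log(t0) - t0 - eta0 / t0 + 1
    return -0.25 * math.log(1 + 4 * eta0) - m * phi0

def ratio(m, eta0):
    # exact value divided by the leading-order approximation; should approach 1
    return math.exp(log_0f1(m + 1, m * m * eta0) - log_lemma2(m, eta0))

ratios = {m: ratio(m, 1.0) for m in (50, 200)}
print(ratios)
```

The ratio should move toward one as $m$ grows, in line with the $1 + o(1)$ factor in (24); the sketch only probes a single real $\eta_0$ and says nothing about uniformity over $\Omega_{0\delta}$.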

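To obtain uniform asymptotic approximations to functions $F_1$ and $F_2$, we use
the contour integral representations (see Olver et al (2010), equations 13.4.9 and
15.6.2)

$$F_j = \frac{C_m}{2\pi i}\int_{0}^{(1+)} \exp\{-m\varphi_j(t)\}\,\psi_j(t)\,dt, \tag{25}$$

where

$$C_m = \frac{\Gamma(m+1)\,\Gamma(m(\varepsilon-1)+1)}{\Gamma(m\varepsilon+1)}, \tag{26}$$

$$\varphi_j(t) = \begin{cases} -\eta_j t - \varepsilon\ln t + (\varepsilon-1)\ln(t-1) & \text{for } j = 1 \\ -\varepsilon\ln\left(t/(1-\eta_j t)\right) + (\varepsilon-1)\ln(t-1) & \text{for } j = 2 \end{cases}, \tag{27}$$

and

$$\psi_j(t) = \begin{cases} (t-1)^{-1} & \text{for } j = 1 \\ (t-1)^{-1}(1-\eta_j t)^{-1} & \text{for } j = 2 \end{cases}. \tag{28}$$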


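For $j = 2$, the contour does not encircle $1/\eta_2$, and the representation is valid for
$\eta_2$ such that $|\arg(1 - \eta_2)| < \pi$. We obtain the following lemma by deriving a
saddle-point approximation to the integral in (25). The relevant saddle points are

$$t_j = \begin{cases} \dfrac{1}{2\eta_j}\left\{\eta_j - 1 + \sqrt{(\eta_j-1)^2 + 4\varepsilon\eta_j}\right\} & \text{for } j = 1 \\[2ex] \dfrac{1}{2\eta_j(\varepsilon-1)}\left\{-1 + \sqrt{1 + 4\varepsilon(\varepsilon-1)\eta_j}\right\} & \text{for } j = 2 \end{cases}. \tag{29}$$

We shall need the following additional notation. Let

$$\omega_j = \arg\varphi_j''(t_j) + \pi \quad\text{and}\quad \omega_{0j} = \arg(t_j - 1), \tag{30}$$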


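where the branches of $\arg(\cdot)$ are chosen so that $|\omega_j + 2\omega_{0j}| \le \pi/2$. Further, for any
small $\delta > 0$ let $\Omega_{1\delta}$ be the set of $(\varepsilon, \eta_1) \in \mathbb{R}\times\mathbb{C}$ such that $\delta < \varepsilon - 1 < 1/\delta$, and

$$\operatorname{Re}\eta_1 > -2\varepsilon + 1, \quad \operatorname{dist}(\eta_1, \mathbb{R}\setminus[0,\infty)) > \delta, \quad |\eta_1| < 1/\delta.$$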
Figure 2: Cross-sections of the sets $\Omega_{1\delta}$ and $\Omega_{2\delta}$ for $\varepsilon = 2$ and $\delta = 0.1$. The horizontal and
vertical axes correspond to the real and purely imaginary numbers, respectively.

Similarly, let $\Omega_{2\delta}$ be the set of $(\varepsilon, \eta_2) \in \mathbb{R}\times\mathbb{C}$ such that $\delta < \varepsilon - 1 < 1/\delta$, and

$$\operatorname{dist}(\eta_2, \mathbb{R}\setminus[0,1]) > \delta, \quad |\eta_2| < 1/\delta.$$

Here, for any $A \subset \mathbb{C}$ and $B \subset \mathbb{C}$, $\operatorname{dist}(A,B) = \inf_{a\in A,\,b\in B}|a - b|$. Figure 2 shows
cross-sections of $\Omega_{1\delta}$ and $\Omega_{2\delta}$ for fixed $\varepsilon$.

Lemma 3 As $m \to \infty$, we have for $j = 1, 2$

$$F_j = C_m\,\psi_j(t_j)\,e^{-i\omega_j/2}\,\left|2\pi m\varphi_j''(t_j)\right|^{-1/2}\exp\{-m\varphi_j(t_j)\}\,(1 + o(1)). \tag{31}$$

The convergence $o(1) \to 0$ holds uniformly with respect to $(\varepsilon, \eta_j) \in \Omega_{j\delta}$ for any
$\delta > 0$.

Point-wise asymptotic approximation (31) was established in Passemier et al
(2014) for $j = 1$, and in Paris (2013a,b) for $j = 2$. However, those papers do
not study the uniformity of the approximation error, which is important for our
analysis. A proof of Lemma 3 is available from the authors upon request. We shall
report it elsewhere.

Using Lemmas 2, 3 and Stirling’s approximation

$$C_m = \sqrt{2\pi m(\varepsilon-1)/\varepsilon}\;\exp\{m(\varepsilon-1)\ln(\varepsilon-1) - m\varepsilon\ln\varepsilon\}\,(1 + o(1)) \tag{32}$$

we set the components of the “Laplace form” (16) of ${}_pF_q$ for cases REG$_0$, REG,
and CCA as follows

$$2f_{III}(z) = \begin{cases} \dfrac{1-c_1}{c_1}\,\varphi_0(t_0) & \text{for REG}_0 \\[2ex] \dfrac{1-c_1}{c_1}\left(\varphi_j(t_j) + \varepsilon\ln\varepsilon - (\varepsilon-1)\ln(\varepsilon-1)\right) & \text{for REG and CCA} \end{cases} \tag{33}$$

and

$$g_{III}(z) = \begin{cases} (1 + 4\eta_0)^{-1/4}\,(1 + o(1)) & \text{for REG}_0 \\[1ex] \psi_j(t_j)\,e^{-i\omega_j/2}\left|\varepsilon\varphi_j''(t_j)/(\varepsilon-1)\right|^{-1/2}(1 + o(1)) & \text{for REG and CCA} \end{cases}, \tag{34}$$

where $j = 1$ for REG and $j = 2$ for CCA. To express $t_j$ and $\eta_j$ in terms of $z$, one should use (29) and (19). We do not need
to know how exactly $o(1)$ in (34) depend on $z$. For our purposes, the knowledge
of the fact that $o(1)$ are analytic functions of $\eta_j$ that converge to zero uniformly
with respect to $(\varepsilon, \eta_j) \in \Omega_{j\delta}$ is sufficient. The analyticity of $o(1)$ follows from the
analyticity of the functions on the left hand sides, and of the factors of $1 + o(1)$ on
the right hand sides of the equations (24) and (31).

Using the definitions of $\varphi_j$ and $t_j$, it is straightforward to verify that $f_{III}^{(REG)}(z)$
and $f_{III}^{(CCA)}(z)$ converge to $f_{III}^{(REG_0)}(z)$ as $c_2 \to 0$. Since, as has been shown
above, a similar convergence holds for $f_I$ and $f_{II}$, we have $f^{(REG)}(z)$ and $f^{(CCA)}(z)$ converging to
$f^{(REG_0)}(z)$ as $c_2 \to 0$. Elementary calculations that use equations (14), (23), (33)
together with the explicit forms of $f_I^{(REG_0)}$ and $f_I^{(SMD)}$ given in Table 3 show
that $f^{(REG_0)}(z) \to f^{(SMD)}(z)$ as $c_1 \to 0$ after transformations $\theta \mapsto \sqrt{c_1}\,\theta$ and
$z \mapsto \sqrt{c_1}\,z + 1$.


4.2 Contours of steep descent 

We shall now show how to deform contours $\mathcal{K}$ in (11) into the contours of steep
descent. First, we find saddle points of functions $f(z)$ for each of the six cases.
Note that the derivative of $f_{II}(z)$ equals minus half of the Stieltjes transform
$m_{\mathbf{c}}(z)$ of the corresponding limiting spectral distribution $F_{\mathbf{c}}$. Although the Stieltjes
transform is formally defined on $\mathbb{C}^{+}$, the definition remains valid on the part of
the real line outside the support $[b_-, b_+]$ of $F_{\mathbf{c}}$. Since we assume that $\gamma_1 < 1$, $F_{\mathbf{c}}$
does not have any non-trivial mass at 0 for sufficiently large $n$ and $p$.

To find saddle points of $f(z)$ we solve equation

$$m_{\mathbf{c}}(z) = 2\,df_{III}(z)/dz. \tag{35}$$
In the Appendix, we find real solutions to (35), $z_0$, that satisfy inequality $z_0 > b_+$.
These solutions are reported in the following lemma.

Lemma 4 Let $b_+$ be the upper boundary of support of $F_{\mathbf{c}}$, and $\bar\theta$ be the threshold
corresponding to $F_{\mathbf{c}}$ as tabulated earlier. Then, for $\theta \in (0,\bar\theta)$ and sufficiently large
$n$ and $p$ as $n,p \to \infty$,

$$z_0 = \begin{cases} \theta + 1/\theta & \text{for SMD} \\ (1+\theta)(\theta + c_1)/\theta & \text{for PCA and REG}_0 \\ (1+\theta)(\theta + c_1)/[\theta\,l(\theta)] & \text{for SigD, REG, and CCA} \end{cases} \tag{36}$$

satisfy inequality $z_0 > b_+$ and solve equation (35).

As $c_2 \to 0$ while $c_1$ stays constant, the value of $z_0$ for SigD, REG, and CCA
converges to that for PCA and REG$_0$. The latter value in its turn converges to
the value of $z_0$ for SMD when $c_1 \to 0$, after the transformations $\theta \mapsto \sqrt{c_1}\,\theta$ and
$z_0 \mapsto \sqrt{c_1}\,z_0 + 1$. Precisely, solving equation

$$\sqrt{c_1}\,z_0 + 1 = (1 + \sqrt{c_1}\,\theta)(\sqrt{c_1}\,\theta + c_1)/(\sqrt{c_1}\,\theta)$$

for $z_0$ and taking limit as $c_1 \to 0$ yields $z_0 = \theta + 1/\theta$.
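The limiting argument above can be illustrated numerically. The sketch below is only a sanity check under the formulas in (36); the function names and the test value $\theta = 0.5$ are illustrative choices, not part of the paper. Expanding the right hand side of the displayed equation gives $z_0 = \theta + 1/\theta + \sqrt{c_1}$, so the gap to the SMD value should shrink like $\sqrt{c_1}$.

```python
import math

def z0_pca(theta, c1):
    # saddle point for PCA and REG_0 from (36)
    return (1 + theta) * (theta + c1) / theta

def z0_smd(theta):
    # saddle point for SMD from (36)
    return theta + 1 / theta

def rescaled_z0(theta, c1):
    # apply theta -> sqrt(c1)*theta, then undo z -> sqrt(c1)*z + 1
    s = math.sqrt(c1)
    return (z0_pca(s * theta, c1) - 1) / s

theta = 0.5
gaps = [abs(rescaled_z0(theta, c1) - z0_smd(theta)) for c1 in (1e-2, 1e-4, 1e-6)]
print(gaps)  # each gap is approximately sqrt(c1)
```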

For the rest of the paper, assume that $\theta \in (0,\bar\theta)$. We deform contour $\mathcal{K}$ in (11)
so that it passes through the saddle point $z_0$ as follows. Let $\mathcal{K} = \mathcal{K}_+ \cup \mathcal{K}_-$, where
$\mathcal{K}_-$ is the complex conjugate of $\mathcal{K}_+$ and $\mathcal{K}_+ = \mathcal{K}_1 \cup \mathcal{K}_2$. For SMD, PCA, and SigD,
let

$$\mathcal{K}_1 = \{z_0 + it : 0 \le t \le 2z_0\} \quad\text{and} \tag{37}$$

$$\mathcal{K}_2 = \{x + i2z_0 : -\infty < x \le z_0\}. \tag{38}$$

The deformed contour is shown on Figure 3.

Note that the singularities of the integrand in (11) are situated at $z = \lambda_j$ (plus
an additional singularity at $z = c_1(1+\theta)/(\theta c_2) > z_0$ for SigD). Since $\lambda_1$ converges to $b_+$
and $z_0 > b_+$, inequality $z_0 > \lambda_1$ must hold with probability approaching one as
$n,p \to \infty$. Therefore by Cauchy’s theorem, the deformation of the contour does
not change the value of $L(\theta;\Lambda)$ with probability approaching one as $n,p \to \infty$.

Strictly speaking, the deformation of the contour is not continuous because $\mathcal{K}_+$
does not approach $\mathcal{K}_-$ at $-\infty$. In particular, in contrast to the original contour,
the deformed one is not “closed” at $-\infty$. Nevertheless, such an “opening up” at
$-\infty$ does not lead to the change of the value of the integral because the integrand
converges fast to zero by absolute value as $\operatorname{Re} z \to -\infty$.

Figure 3: Deformed contour $\mathcal{K}$ for SMD, PCA, and SigD.

Remark 5 In the event of asymptotically negligible probability that the deformed
contour $\mathcal{K}$ does not encircle all $\lambda_j$, we not only lose the equality (11) but also face
the difficulty that function $g(z)$ ceases to be well defined as the definition of $g_{II}(z)$
contains a logarithm of a non-positive number. To eliminate any ambiguity, if such
an event holds we shall redefine $g_{II}(z)$ as unity.

For REG$_0$ and CCA, let

$$z_1 = \begin{cases} -(1-c_1)^2/[4\theta] & \text{for REG}_0 \\ -c_1(1-c_1)^2\,l(\theta)/[4\theta r^2] & \text{for CCA} \end{cases},$$

and let

$$\mathcal{K}_1 = \{z_1 + |z_0 - z_1|\exp\{i\gamma\} : \gamma \in [0,\pi/2]\} \quad\text{and}$$
$$\mathcal{K}_2 = \{z_1 - x + |z_0 - z_1|\exp\{i\pi/2\} : x \ge 0\}.$$

The corresponding contour $\mathcal{K}$ is shown on Figure 4. Similarly to the SMD, PCA
and SigD cases, the deformation of the contour in (11) to $\mathcal{K}$ does not change the
value of $L(\theta;\Lambda)$ with probability approaching one as $n,p \to \infty$.
Figure 4: Deformed contour $\mathcal{K}$ for REG$_0$ and CCA.

For REG, deformed contour $\mathcal{K}$ in $z$-plane is simpler to describe as an image of
a contour $\mathcal{C}$ in $\tau$-plane, where $\tau = \eta_1 t_1$ with

$$\eta_1 = z\theta c_2/[c_1(1-c_1)] \tag{39}$$

and $t_1$ as defined in (29). Let $\mathcal{C} = \mathcal{C}_+ \cup \mathcal{C}_-$, where $\mathcal{C}_-$ is the complex conjugate of
$\mathcal{C}_+$ and $\mathcal{C}_+ = \mathcal{C}_1 \cup \mathcal{C}_2$, and let

$$\mathcal{C}_1 = \{-\varepsilon + |\tau_0 + \varepsilon|\exp\{i\gamma\} : \gamma \in [0,\pi/2]\} \quad\text{and}$$
$$\mathcal{C}_2 = \{-\varepsilon - x + |\tau_0 + \varepsilon|\exp\{i\pi/2\} : x \ge 0\},$$

where $\tau_0 = (\theta + c_1)/(1 - c_1)$.
Using (39) and the identity

$$\eta_1 = \tau(\tau+1)/(\tau+\varepsilon), \tag{40}$$

we obtain

$$z = \frac{c_1(1-c_1)}{\theta c_2}\,\frac{\tau(\tau+1)}{\tau+\varepsilon}. \tag{41}$$

We define the deformed contour $\mathcal{K}$ in $z$-plane as the image of $\mathcal{C}$ under the transformation
$\tau \mapsto z$ given by (41). The parts $\mathcal{K}_+$, $\mathcal{K}_-$, $\mathcal{K}_1$ and $\mathcal{K}_2$ of $\mathcal{K}$ are defined as the
images of the corresponding parts of $\mathcal{C}$. Note that $\tau_0$ is transformed to $z_0$ so that
$\mathcal{K}$ passes through the saddle point $z_0$.
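The claim that $\tau_0$ is mapped to $z_0$ can be verified with a few lines of arithmetic. This is a minimal sketch; it assumes $c_j = p/n_j$ and $n = n_1 + n_2$, so that $\varepsilon = (1/c_1 + 1/c_2 - 1)/(1/c_1 - 1)$, and the parameter values are arbitrary illustrative choices.

```python
def l_of(theta, c1, c2):
    # l(theta) from (20)
    return 1 + (1 + theta) * c2 / c1

def eps_of(c1, c2):
    # epsilon = (n - p)/(n1 - p), assuming n = n1 + n2 and c_j = p/n_j
    return (1 / c1 + 1 / c2 - 1) / (1 / c1 - 1)

def z_from_tau(tau, theta, c1, c2):
    # the map tau -> z from (41)
    e = eps_of(c1, c2)
    return c1 * (1 - c1) / (theta * c2) * tau * (tau + 1) / (tau + e)

def z0_reg(theta, c1, c2):
    # saddle point for SigD, REG, and CCA from (36)
    return (1 + theta) * (theta + c1) / (theta * l_of(theta, c1, c2))

theta, c1, c2 = 0.5, 0.5, 0.5
tau0 = (theta + c1) / (1 - c1)
image = z_from_tau(tau0, theta, c1, c2)
target = z0_reg(theta, c1, c2)
print(image, target)
```

For these values both numbers equal 1.2 up to rounding, which matches the statement that the image contour passes through the saddle point.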

The following lemma is proven in the Appendix. It shows that $\mathcal{K}_1$ are contours
of steep descent of $-\operatorname{Re} f(z)$ for all the six cases, SMD, PCA, SigD, REG$_0$, REG,
and CCA.

Lemma 6 For any of the six cases that we study, as $z$ moves along the corresponding
$\mathcal{K}_1$ away from $z_0$, $-\operatorname{Re} f(z)$ is strictly decreasing.

5 Laplace approximation 

The goal of this section is to derive Laplace approximations to the integrals

$$L(\theta;\Lambda) = \frac{\sqrt{\pi p}}{2\pi i}\int_{\mathcal{K}} e^{-pf(z)}\,g(z)\,dz$$

for the six cases that we study. First, consider a general integral

$$I_{p,\omega} = \int_{\mathcal{K}_{p,\omega}} e^{-p\phi_{p,\omega}(z)}\,\chi_{p,\omega}(z)\,dz,$$

where $p \to \infty$, $\omega$ is a $k$-dimensional parameter that belongs to a subset $\Omega$ of $\mathbb{R}^k$,
$\mathcal{K}_{p,\omega}$ is a path in $\mathbb{C}$ that starts at $a_{p,\omega}$ and ends at $b_{p,\omega}$, and for sufficiently large $p$,
$\phi_{p,\omega}(z)$ is a single-valued holomorphic function of $z$ in a domain $T_{p,\omega}$ that contains
$\mathcal{K}_{p,\omega}$.

We allow $\chi_{p,\omega}(z)$ to be a random element of the normed space of continuous
functions on $\mathcal{K}_{p,\omega}$ with the supremum norm. Furthermore, we suppose that for any
$\delta > 0$, there exists $\bar p$ such that for any $p > \bar p$, $\chi_{p,\omega}(z)$ is a single-valued holomorphic
function of $z$ in the domain $T_{p,\omega}$ with probability larger than $1 - \delta$. In what follows,
we shall omit subscripts $p$ and $\omega$ from the notation $\phi_{p,\omega}$, $\chi_{p,\omega}$, $\mathcal{K}_{p,\omega}$, $a_{p,\omega}$, $b_{p,\omega}$, $T_{p,\omega}$ to make
it lighter.

Suppose that $\phi'(z) = 0$ at $z_0$ which is an interior point of $\mathcal{K}$, and suppose
that $\operatorname{Re}\phi(z)$ is strictly increasing as $z$ moves away from $z_0$ along the path. In
other words, the path $\mathcal{K}$ is a contour of steep descent of $-\operatorname{Re}\phi(z)$. Denote a
closed segment of $\mathcal{K}$ contained between $z_1$ and $z_2$ as $[z_1,z_2]_{\mathcal{K}}$. Similarly denote the
segments that exclude one or both endpoints as $[z_1,z_2)_{\mathcal{K}}$, $(z_1,z_2]_{\mathcal{K}}$, and $(z_1,z_2)_{\mathcal{K}}$.
Let $\beta$ be the limiting value of $\arg(z - z_0)$ on the principal branch as $z \to z_0$ along
$(z_0,b)_{\mathcal{K}}$. Finally, let $\phi_s$ and $\chi_s$ with $s = 0,1,\dots$ be the coefficients in the power
series representations

$$\phi(z) = \sum_{s=0}^{\infty}\phi_s(z-z_0)^s, \qquad \chi(z) = \sum_{s=0}^{\infty}\chi_s(z-z_0)^s. \tag{42}$$

We assume that there exist positive constants $C_1,\dots,C_4$ that do not depend on
$p$ and on $\omega$, such that for all $\omega \in \Omega$, for sufficiently large $p$:

A0 The length of the path $\mathcal{K}$ is bounded, uniformly over $\omega \in \Omega$ and all sufficiently
large $p$. Furthermore,

$$\sup_{z\in(z_0,b)_{\mathcal{K}}}|z - z_0| > C_1, \quad\text{and}\quad \sup_{z\in(a,z_0)_{\mathcal{K}}}|z - z_0| > C_1$$

A1 Functions $\phi(z)$ and $\chi(z)$ are holomorphic in the ball $|z - z_0| < C_1$

A2 The coefficient $\phi_2$ satisfies $C_2 < |\phi_2| < C_3$

A3 The third derivative of $\phi(z)$ satisfies inequality

$$\sup_{|z-z_0|<C_1}\left|d^3\phi(z)/dz^3\right| < C_4$$

A4 For any positive $\varepsilon < C_1$, which does not depend on $p$ and $\omega$, and for all
$z_1 \in \mathcal{K}$ such that $|z_1 - z_0| = \varepsilon$, there exist positive constants $C_5, C_6$, such
that

$$\operatorname{Re}(\phi(z_1) - \phi_0) > C_5 \quad\text{and}\quad |\operatorname{Im}(\phi(z_1) - \phi_0)| < C_6$$

A5 For a subset $\Theta$ of $\mathbb{C}$ that consists of all points whose Euclidean distance from
$\mathcal{K}$ is no larger than $C_1$,

$$\sup_{z\in\Theta}|\chi(z)| = O_p(1)$$

as $p \to \infty$, where $O_p(1)$ is uniform in $\omega \in \Omega$.

The following lemma is a fairly straightforward extension of Theorem 7.1 of
Olver (1997), p. 127 to the situation where functions $\phi(z)$, $\chi(z)$ and the contour
$\mathcal{K}$ depend on $p$ and $\omega$. In Olver’s original theorem, which uses different notation,
both the functions and the contour are fixed. A proof of the extension is available
from the authors upon request.
Case      Value of $D_2$

SMD       $1 - \theta^2$
PCA       $c_1(c_1 - \theta^2)(1+\theta)^2$
SigD      $r^2 h\,(1+\theta)^2/l^4$
REG$_0$   $c_1(1 + c_1 + 2\theta)(c_1 - \theta^2)$
REG       $c_1 h\,(c_1 + \theta + (1+\theta)l)/l^4$
CCA       $c_1^2 h\,(2(c_1 + \theta) + l(1 - c_1))/\{l^4(c_1 + c_2)\}$

Table 4: The values of $D_2 = \theta^2\left(-2\,d^2f(z_0)/dz^2\right)^{-1}$. Here $l = l(\theta)$ is as defined in
(20) and $h = h(\theta) = c_1 + c_2(1+\theta)^2 - \theta^2$.


Lemma 7 Under assumptions A0-A5, for any positive integer $k$, as $p \to \infty$, we
have

$$I = 2e^{-p\phi_0}\left[\sum_{s=0}^{k-1}\Gamma\!\left(s + \tfrac{1}{2}\right)\frac{a_{2s}}{p^{s+1/2}} + \frac{O_p(1)}{p^{k+1/2}}\right],$$

where $O_p(1)$ is uniform in $\omega \in \Omega$ and the coefficients $a_s$ can be expressed through
$\phi_s$ and $\chi_s$ defined above. In particular we have $a_0 = \chi_0/[2\phi_2^{1/2}]$, where $\phi_2^{1/2} =
\exp\{(\log|\phi_2| + i\arg\phi_2)/2\}$ with the branch of $\arg\phi_2$ chosen so that $|\arg\phi_2 + 2\beta| \le
\pi/2$.


We use Lemma 7 to obtain the Laplace approximation to

$$L_1(\theta;\Lambda) = \frac{\sqrt{\pi p}}{2\pi i}\int_{\mathcal{K}_1\cup\bar{\mathcal{K}}_1} e^{-pf(z)}\,g(z)\,dz. \tag{43}$$

Then we show that $L_1(\theta;\Lambda)$ asymptotically dominates the “residual” $L(\theta;\Lambda) -
L_1(\theta;\Lambda)$. For this analysis, it is important to know the values of $f(z_0)$ and
$d^2f(z_0)/dz^2$. We derive them in the Appendix. It turns out that as long as
$\theta \in [0,\bar\theta)$, $f(z_0) = 0$ for all the six cases that we study. The values of $d^2f(z_0)/dz^2$
are all negative. The explicit form of $D_2 = \theta^2\left(-2\,d^2f(z_0)/dz^2\right)^{-1}$, which is somewhat
shorter than that for $d^2f(z_0)/dz^2$, is reported in Table 4. We formulate the
main result of this section in the following theorem. Its proof is given in the
Appendix.

Theorem 8 Suppose that the null hypothesis holds, that is, $\theta_0 = 0$. Let $\bar\theta$ be the
threshold corresponding to $F_{\mathbf{c}}$ as tabulated earlier, and let $\varepsilon$ be an arbitrarily small
fixed positive number. Then for any $\theta \in (0, \bar\theta - \varepsilon]$, as $n,p \to \infty$, we have

$$L(\theta;\Lambda) = \frac{g(z_0)}{\sqrt{-2\,d^2f(z_0)/dz^2}} + O_p\!\left(p^{-1}\right), \tag{44}$$

where $O_p(p^{-1})$ is uniform in $\theta \in (0, \bar\theta - \varepsilon]$ and the principal branch of the square
root is taken.
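The leading term in (44) is what the Laplace form (11) delivers when the contour runs vertically through a saddle point with $f(z_0) = 0$ and $f''(z_0) < 0$. The sketch below illustrates this on a toy example; the functions `f_toy` and `g_toy`, the saddle location, and the value of $p$ are hypothetical choices made only for the illustration and are not the $f$ and $g$ of the paper.

```python
import cmath
import math

def laplace_form(p, f, g, z0, half_width=1.0, steps=20001):
    # sqrt(pi p)/(2 pi i) times the trapezoid-rule integral of exp(-p f) g
    # along the vertical segment z = z0 + i t, |t| <= half_width
    h = 2 * half_width / (steps - 1)
    total = 0j
    for k in range(steps):
        t = -half_width + k * h
        z = z0 + 1j * t
        weight = 0.5 if k in (0, steps - 1) else 1.0
        total += weight * cmath.exp(-p * f(z)) * g(z) * 1j * h
    return math.sqrt(math.pi * p) / (2j * math.pi) * total

z0 = 1.5  # hypothetical saddle point

def f_toy(z):
    # f(z0) = 0, f'(z0) = 0, f''(z0) = -1; Re f increases along z0 + i t
    return -(z - z0) ** 2 / 2 + (z - z0) ** 4 / 4

def g_toy(z):
    return 2 + (z - z0) ** 2

value = laplace_form(4000, f_toy, g_toy, z0)
leading = 2 / math.sqrt(2)  # g(z0) / sqrt(-2 f''(z0))
print(value, leading)
```

With $p = 4000$ the numerical value should agree with the leading term up to a correction of order $1/p$, which is the size of the remainder in (44).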


6 Asymptotics of LR 


Combining the results of Theorem 8 with the definitions of $g(z)$ and the values of
$-2\,d^2f(z_0)/dz^2$ (given in Table 4), it is straightforward to establish the following
theorem. Let

$$\Delta_p(\theta) = p\int \ln(z_0 - \lambda)\,d\bigl(\hat F(\lambda) - F_{\mathbf{c}}(\lambda)\bigr).$$

In accordance with the remark made above, we define $\Delta_p(\theta)$ as zero in the event
of asymptotically negligible probability that $z_0 \le \lambda_1$.

Theorem 9 Suppose that the null hypothesis holds, that is $\theta_0 = 0$. Let $\bar\theta$ be the
threshold corresponding to $F_{\mathbf{c}}$ as tabulated earlier, and let $\varepsilon$ be an arbitrarily small
fixed positive number. Then for any $\theta \in (0, \bar\theta - \varepsilon]$, as $n,p \to \infty$, we have

$$L(\theta;\Lambda) = \exp\left\{-\tfrac{1}{2}\Delta_p(\theta) + \tfrac{1}{2}\ln\left(1 - [\delta_p(\theta)]^2\right)\right\}(1 + o_p(1)),$$

where

$$\delta_p(\theta) = \begin{cases} \theta & \text{for SMD} \\ \theta/\sqrt{c_1} & \text{for PCA and REG}_0 \\ \theta r/(c_1\,l(\theta)) & \text{for SigD, REG, and CCA} \end{cases}$$

and $o_p(1)$ is uniform in $\theta \in (0, \bar\theta - \varepsilon]$.


Statistic $\Delta_p(\theta)$ is a linear spectral statistic. As follows from the CLT derived
by Bai and Yao (2005), Bai and Silverstein (2004), and Zheng (2012) for the
semi-circle, Marchenko-Pastur, and Wachter limiting distributions $F_{\mathbf{c}}$, respectively,
statistic $\Delta_p(\theta)$ weakly converges to a Gaussian process indexed by $\theta \in (0, \bar\theta - \varepsilon]$.
The explicit form of the mean and the covariance structure can be obtained from
the general formulae for the asymptotic mean and covariance of linear spectral
statistics given in Theorem 1.1 of Bai and Yao (2005) for SMD, in Theorem 1.1 of
Bai and Silverstein (2004) for PCA and REG$_0$, and in Theorem 4.1 and Example 4.1
of Zheng (2012) for the remaining cases. For PCA, the corresponding calculations
have been done in Onatski et al (2013). We omit details of the similar calculations
for the remaining cases to save space. The convergence of $\Delta_p(\theta)$ and Theorem 9
imply the following theorem.
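Theorem 10 Suppose that the null hypothesis holds, that is $\theta_0 = 0$. Let $\bar\theta$ be the
threshold corresponding to $F_{\mathbf{c}}$ as tabulated earlier, and let $\varepsilon$ be an arbitrarily small
fixed positive number. Further, let $C[0, \bar\theta - \varepsilon]$ be the space of continuous functions
on $[0, \bar\theta - \varepsilon]$ equipped with the supremum norm. Then $\ln L(\theta;\Lambda)$ viewed as random
elements of $C[0, \bar\theta - \varepsilon]$ converge weakly to $\mathcal{L}(\theta)$ with Gaussian finite dimensional
distributions such that

$$E\mathcal{L}(\theta) = \tfrac{1}{4}\ln\left(1 - [\delta(\theta)]^2\right)$$

and

$$\operatorname{Cov}(\mathcal{L}(\theta_1), \mathcal{L}(\theta_2)) = -\tfrac{1}{2}\ln\left(1 - \delta(\theta_1)\delta(\theta_2)\right)$$

with

$$\delta(\theta) = \begin{cases} \theta & \text{for SMD} \\ \theta/\sqrt{\gamma_1} & \text{for PCA and REG}_0 \\ \theta\rho/(\gamma_1 + \gamma_2 + \theta\gamma_2) & \text{for SigD, REG, and CCA} \end{cases}$$

Here $\rho, \gamma_1, \gamma_2$ are the limits of $r, c_1, c_2$ as $n,p \to \infty$.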
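The limiting mean and covariance are simple to evaluate. The following sketch encodes $\delta(\theta)$ and the two moment formulas as read from Theorem 10; the function names are illustrative. It also uses the Wachter threshold $(\gamma_2 + \rho)/(1 - \gamma_2)$ quoted later in the text, with $\rho^2 = \gamma_1 + \gamma_2 - \gamma_1\gamma_2$ taken as the limit of $r^2$ from Table 3.

```python
import math

def delta_smd(theta):
    return theta

def delta_mp(theta, g1):
    # PCA and REG_0 (Marchenko-Pastur limit)
    return theta / math.sqrt(g1)

def delta_wachter(theta, g1, g2):
    # SigD, REG, and CCA (Wachter limit)
    rho = math.sqrt(g1 + g2 - g1 * g2)
    return theta * rho / (g1 + g2 + theta * g2)

def mean_loglr(d):
    # E L(theta) = (1/4) ln(1 - delta^2)
    return 0.25 * math.log(1 - d * d)

def cov_loglr(d1, d2):
    # Cov(L(theta1), L(theta2)) = -(1/2) ln(1 - delta1 * delta2)
    return -0.5 * math.log(1 - d1 * d2)

g1 = g2 = 0.9
rho = math.sqrt(g1 + g2 - g1 * g2)
threshold = (g2 + rho) / (1 - g2)
d = delta_wachter(5.0, g1, g2)
print(threshold, d, mean_loglr(d), cov_loglr(d, d))
```

Two consistency checks follow from these formulas: the mean equals minus one half of the variance, as expected for the logarithm of a likelihood ratio under the null, and $\delta$ reaches one exactly at the threshold, where the logarithms diverge.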

Note that the theorem establishes the weak convergence of the log likelihood
ratio viewed as a random element of the space of continuous functions. This is
much stronger than simply the convergence of the finite dimensional distributions
of the log likelihood process. In particular, the theorem can be used to find the
asymptotic distribution of the supremum of the likelihood ratio, and thus, to find
the asymptotic critical values of the likelihood ratio test. We do not pursue this
line of research here.

Let $\{P_{p,\theta}\}$ and $\{P_{p,0}\}$ be the sequences of measures corresponding to the joint
distributions of $\lambda_1,\dots,\lambda_p$ when $\theta_0 = \theta$ and when $\theta_0 = 0$ respectively. Then Theorem
10 implies, via Le Cam’s first lemma, the mutual contiguity of $\{P_{p,\theta}\}$ and $\{P_{p,0}\}$ as
$n,p \to \infty$. This reveals the statistical meaning of the phase transition thresholds
as the upper boundaries of the contiguity regions for spiked models.

The precise form of the autocovariance of $\mathcal{L}(\theta)$ shows that, although the experiment
of observing $\lambda_1,\dots,\lambda_p$ is asymptotically normal, it does not converge to
a Gaussian shift experiment. In particular, the optimality results available for
Gaussian shifts cannot be used in our framework. To analyze asymptotic risks of
various statistical problems related to the experiment of observing $\lambda_1,\dots,\lambda_p$, one
should directly use Theorem 10.

Footnote: Fyodorov et al (2013) have an interesting discussion of ubiquity of random processes with
logarithmic covariance structure in physics and engineering applications. In their paper, such
processes appear as limiting objects related to the behavior of the characteristic polynomials of
large matrices from Gaussian Unitary Ensemble.

In this paper, we use Theorem 10 to derive the asymptotic power envelopes
for tests of the null hypothesis $\theta_0 = 0$ against the alternative $\theta_0 > 0$. Such a
power envelope has been derived by Onatski et al (2013) for the case of PCA. By
the Neyman-Pearson lemma, the most powerful test of the null against a point
alternative $\theta_0 = \theta$ would reject the null when $\ln L(\theta;\Lambda)$ is above a critical value.
By Theorem 10 and Le Cam’s third lemma (see van der Vaart (1998), chapter 6),

$$\ln L(\theta;\Lambda) \xrightarrow{d} N\!\left(\tfrac{1}{4}\ln\left(1 - [\delta(\theta)]^2\right),\; -\tfrac{1}{2}\ln\left(1 - [\delta(\theta)]^2\right)\right)$$

under the null, and

$$\ln L(\theta;\Lambda) \xrightarrow{d} N\!\left(-\tfrac{1}{4}\ln\left(1 - [\delta(\theta)]^2\right),\; -\tfrac{1}{2}\ln\left(1 - [\delta(\theta)]^2\right)\right)$$

under the alternative. This implies the following theorem.


Theorem 11 Let $\bar\theta$ be the threshold corresponding to $F_{\mathbf{c}}$ as tabulated earlier. For
any $\theta \in [0, \bar\theta)$, the value of the asymptotic power envelope for the tests of the
null $\theta_0 = 0$ against the alternative $\theta_0 > 0$ which are based on $\lambda_1,\dots,\lambda_p$ and have
asymptotic size $\alpha$ is given by

$$PE(\theta) = 1 - \Phi\!\left[\Phi^{-1}(1-\alpha) - \sqrt{-\tfrac{1}{2}\ln\left(1 - [\delta(\theta)]^2\right)}\,\right].$$

Here $\Phi$ denotes the standard normal cumulative distribution function. For $\theta \ge \bar\theta$
the value of the asymptotic power envelope equals one.
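The envelope is a one-line computation once $\delta(\theta)$ is known. Below is a minimal sketch using only the Python standard library; the function name `power_envelope` and the sample values of $\delta$ are illustrative. It implements the formula of Theorem 11 as a function of $\delta = \delta(\theta) \in [0, 1)$.

```python
import math
from statistics import NormalDist

def power_envelope(delta, alpha=0.05):
    # PE = 1 - Phi( Phi^{-1}(1 - alpha) - sqrt(-(1/2) ln(1 - delta^2)) )
    nd = NormalDist()
    sigma = math.sqrt(-0.5 * math.log(1 - delta * delta))
    return 1 - nd.cdf(nd.inv_cdf(1 - alpha) - sigma)

for delta in (0.0, 0.5, 0.9, 0.99):
    print(delta, power_envelope(delta))
```

At $\delta = 0$ the envelope equals the nominal size $\alpha$, and it increases toward one as $\delta \to 1$, that is, as $\theta$ approaches the phase transition threshold.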


The envelopes are different only for the cases that correspond to different limiting
spectral distributions: the semi-circle, the Marchenko-Pastur, and the Wachter
distribution. Therefore, we can denote $PE(\theta)$ as $PE^{SC}(\theta)$ for SMD, as $PE^{MP}(\theta)$
for PCA and REG$_0$, and as $PE^{W}(\theta)$ for the remaining cases. Figure 5 shows the
graphs of the envelopes for $\alpha = 0.05$ and $\gamma_1 = \gamma_2 = 0.9$. Such large values of $\gamma_1$ and
$\gamma_2$ correspond to situations where the dimensionality $p$ is not very different from
the “sample sizes” $n_1$ and $n_2$. Of course, the values of $\gamma_1$ and $\gamma_2$ are irrelevant for
$PE^{SC}(\theta)$, and the value of $\gamma_2$ is irrelevant for $PE^{MP}(\theta)$.

Figure 5: The asymptotic power envelopes $PE^{SC}(\theta)$, $PE^{MP}(\theta)$, and $PE^{W}(\theta)$ for
$\alpha = 0.05$, $\gamma_1 = \gamma_2 = 0.9$.

Note that the asymptotic power envelope $PE^{MP}(\theta)$ can be obtained from
$PE^{W}(\theta)$ by sending $\gamma_2$ to zero. Further, $PE^{SC}(\theta)$ can be obtained from $PE^{MP}(\theta)$
by transformation $\theta \mapsto \sqrt{\gamma_1}\,\theta$. Further, note the difference in the horizontal scale
of the bottom panel of Figure 5 relative to the two other panels. For $\gamma_1 = \gamma_2 = 0.9$
the phase transition threshold corresponding to Wachter distribution is relatively
large. It equals $(\gamma_2 + \rho)/(1 - \gamma_2) \approx 18.9$. Moreover, the value of $PE^{W}(\theta)$ becomes
substantially larger than the nominal size $\alpha = 0.05$ for $\theta$ that are situated
far below this threshold. This suggests that the information in all the eigenvalues
$\lambda_1,\dots,\lambda_p$ might be effectively used to detect spikes that are small relative to the
phase transition threshold in two sample problems. We leave a confirmation or
rejection of this speculation for future research.

7 Conclusion 

This paper derives the asymptotics of the likelihood ratio processes corresponding
to the null hypothesis of no spikes and the alternative of a single spike in various
high-dimensional multivariate models. We cover all the five classes of multivariate
statistical problems identified by James (1964). In addition, we consider a symmetric
matrix denoising problem that does not fit in James’ classification. We find
that, as the dimensionality and the number of observations go to infinity proportionally,
the log likelihood processes converge to Gaussian limits as long as the
value of the spike parameter is below the corresponding phase transition threshold.
We derive explicit formulae for the autocovariance and the mean of the limiting
processes and use them to obtain asymptotic power envelopes for tests for the
presence of a spike.

In this paper, we make the assumption that $n_2 > p$ to ensure the invertibility
of matrix $E$ in (1) with probability one. However, we also make the assumption
$n_1 > p$, which can be lifted without a substantial reformulation of the problem. We
make the latter assumption mostly to simplify our exposition. The assumption is
irrelevant for SMD. The PCA results are obtained in Onatski et al (2013) without
using this assumption. For SigD, our derivations (not reported here) show that the
equivalent of (7) for $n_1 < p$ involves the hypergeometric function ${}_2F_1$. Therefore,
SigD with $n_1 < p$ represents the fifth, rather than the second, group of multivariate
statistical problems according to James’ (1964) classification. The REG$_0$ problem
is symmetric with respect to $n_1$ and $p$ after a simple reparametrization. For REG,
an equivalent of (7) for $n_1 < p$ can be obtained using equation (74) of James
(1964). However, further analysis of REG in this situation needs more substantial
changes to our analysis. In CCA case, the sample canonical correlations are only
well defined if $n_1 > p$. To summarize, when $n_1 < p$, the only interesting untreated
cases are SigD and REG. We leave their study for future research.

8 Appendix 


8.1 SMD entry of Table 1

The explicit expression for $L(\theta;\Lambda)$ given in that table follows from the following
lemma.

Lemma 12 For SMD case, the joint density of the diagonal elements of $\Lambda$ evaluated
at the diagonal elements of $x = \operatorname{diag}\{x_1,\dots,x_p\}$ with $x_1 \ge \dots \ge x_p$ equals

$$C_p(x)\exp\{-p\theta^2/4\}\;{}_0F_0(\Psi, x), \tag{45}$$

where $C_p(x)$ is a quantity that depends on $p$ and $x$, but not on $\theta$, and $\Psi =
\operatorname{diag}\{\theta p/2, 0,\dots,0\}$. The density under the null hypothesis is obtained from the
above expression by setting $\theta = 0$.

Proof: The proof is based on the “symmetrization trick” used by James (1955)
to derive the density of non-central Wishart distribution. Let $Y = U'XU$, where
$U$ is a random matrix from $O(p)$ and $X = Z/\sqrt{p} + \theta\eta\eta'$ with $Z$ from GOE, $\theta \ge 0$,
and $\|\eta\| = 1$. Note that the eigenvalues of $X$ and $Y$ are the same. The joint density
of the functionally independent elements of $Y$ evaluated at $y$ is

$$(2\pi/p)^{-p(p+1)/4}\,2^{-p/2}\int_{O(p)}\operatorname{etr}\left\{-\frac{p}{4}\left(UyU' - \theta\eta\eta'\right)^2\right\}(dU),$$

where $(dU)$ is the normalized uniform measure over $O(p)$. Taking the square under
etr and factorizing, we obtain an equivalent expression

$$(2\pi/p)^{-p(p+1)/4}\,2^{-p/2}\exp\left\{-\frac{p}{4}\theta^2\right\}\operatorname{etr}\left\{-\frac{p}{4}y^2\right\}\int_{O(p)}\operatorname{etr}\left\{\frac{p\theta}{2}\,UyU'\eta\eta'\right\}(dU).$$

Now change the variables from $y$ to $(H, x)$, where $y = HxH'$ is the spectral
decomposition of $y$, and integrate $H$ out to obtain (45) with

$$C_p(x) = \frac{p^{p(p+1)/4}\,\pi^{p(p-1)/4}}{2^{p(p-1)/4+p}\,\Gamma_p(p/2)}\exp\left\{-\frac{p}{4}\sum_{i=1}^{p}x_i^2\right\}\prod_{i<j}(x_i - x_j).$$

Here $\Gamma_p(p/2)$ is the multivariate Gamma function (see Muirhead (1982), pp 61-63).

□


8.2 Proof of Lemma 4

It is sufficient to prove the lemma for SigD, REG and CCA. For PCA and REG$_0$,
the lemma follows by taking the limits of SigD and REG cases as $c_2 \to 0$. For
SMD, the lemma then follows by taking the limit of PCA case as $c_1 \to 0$, after the
transformations $\theta \mapsto \sqrt{c_1}\,\theta$ and $z \mapsto \sqrt{c_1}\,z + 1$.

Our proofs of the lemma are very similar for SigD, REG and CCA. Here we
show only the proof for SigD. First, note that the minimum of $z_0$ over $\theta > 0$ equals

$$b_+ = c_1\left(\frac{r+1}{r+c_2}\right)^2$$

and is achieved at

$$\theta = \bar\theta_p = (c_2 + r)/(1 - c_2).$$

Therefore, since $m_{\mathbf{c}}(z)$ is well defined for $z > b_+$ and since $\bar\theta_p \to \bar\theta$ as $n,p \to \infty$,
$m_{\mathbf{c}}(z_0)$ must be well-defined for any $\theta \in (0,\bar\theta)$ and for sufficiently large $n,p$.
Using an explicit expression for the Stieltjes transform of the limiting spectral
distribution of the multivariate F matrix, which is given by Bai and Silverstein
(2006) p.71, we obtain

$$m_{\mathbf{c}}(z) = -\frac{(c_1 - 1)(c_1 - c_2z) + (1 - c_2)c_1z}{2c_1z(c_1 - c_2z)} + \frac{\sqrt{((c_1 - c_2)z + c_1(1 - c_1))^2 - 4c_1z(c_1 - c_2z)}}{2c_1z(c_1 - c_2z)}, \tag{46}$$

where the signs are chosen so that $m_{\mathbf{c}}(z)$ behaves as $-1/z$ for large $z$, in agreement with (35). Further,

$$2\frac{d}{dz}f_{III}^{(SigD)}(z) = -\frac{\theta r^2}{c_1^2(1+\theta) - c_1c_2\theta z}.$$

It now takes a direct algebra, which we perform using Maple’s symbolic algebra
software, to verify that $z_0$ solves equation (35). □
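The algebraic verification can be mirrored by a quick numerical check for SigD. The sketch below is not the Maple computation referred to above. It assumes the sign convention under which the Stieltjes transform behaves as $-1/z$ for large $z$, takes the discriminant from (46), and uses the derivative of $2f_{III}$ obtained by differentiating (17). Function names and parameter values are illustrative.

```python
import math

def stieltjes_wachter(z, c1, c2):
    # Stieltjes transform of the limiting distribution; signs assumed so that m(z) ~ -1/z
    num = -((c1 - 1) * (c1 - c2 * z) + (1 - c2) * c1 * z)
    disc = ((c1 - c2) * z + c1 * (1 - c1)) ** 2 - 4 * c1 * z * (c1 - c2 * z)
    return (num + math.sqrt(disc)) / (2 * c1 * z * (c1 - c2 * z))

def two_df3_sigd(z, theta, c1, c2):
    # derivative of 2 f_III for SigD, from (17)
    r2 = c1 + c2 - c1 * c2
    return -theta * r2 / (c1 ** 2 * (1 + theta) - c1 * c2 * theta * z)

def z0_sigd(theta, c1, c2):
    # saddle point from (36)
    l = 1 + (1 + theta) * c2 / c1
    return (1 + theta) * (theta + c1) / (theta * l)

checks = []
for theta, c1, c2 in [(0.4, 0.5, 0.25), (0.5, 0.5, 0.5)]:
    z0 = z0_sigd(theta, c1, c2)
    checks.append((stieltjes_wachter(z0, c1, c2), two_df3_sigd(z0, theta, c1, c2)))
print(checks)
```

In both parameter settings the two sides of (35) coincide at $z_0$ up to rounding, which is consistent with Lemma 4 for sub-critical $\theta$.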
8.3 Proof of Lemma 6

For SMD, PCA, and SigD, $|z - \lambda|$ is obviously strictly increasing for any $\lambda \in \mathbb{R}$
as $z$ moves away from $z_0$ along $\mathcal{K}_1$. Therefore,

$$2\operatorname{Re} f_{II}(z) = \int \ln|z - \lambda|\,dF_{\mathbf{c}}(\lambda)$$

is strictly increasing. On the other hand, by (17), $\operatorname{Re} f_{III}(z)$ is non-decreasing.
Hence $\operatorname{Re} f(z)$ is strictly increasing.

For REG$_0$ and CCA, $|z - \lambda|$ is strictly increasing for any $\lambda \ge 0$ as $z$ moves
away from $z_0$ along $\mathcal{K}_1$ because the center of the circumference that includes $\mathcal{K}_1$
is a negative real number. Therefore, $\operatorname{Re} f_{II}(z)$ is strictly increasing. To show
that $\operatorname{Re} f_{III}(z)$ is strictly increasing too, it is sufficient to prove that $\operatorname{Re}\varphi_j(t_j)$ is
strictly increasing for $j = 0, 2$. A proof of this fact relies on elementary calculus.
It is available from the authors upon request.

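For REG, $z$ moves away from $z_0$ along $\mathcal{K}_1$ when $\tau$ moves away from $\tau_0$ along
$\mathcal{C}_1$. Using (27), (33), and (40), we obtain

$$\operatorname{Re} f_{III}(\tau) = \frac{1-c_1}{2c_1}\left(-\operatorname{Re}\tau + \ln|\tau+1| - \varepsilon\ln|\tau+\varepsilon| + \varepsilon\ln\varepsilon\right).$$

On the other hand, $|\tau + \varepsilon|$ remains constant on $\mathcal{C}_1$ whereas both $-\operatorname{Re}\tau$ and $|\tau + 1|$
increase as $\tau$ moves away from $\tau_0$ along $\mathcal{C}_1$. To see that $|\tau + 1|$ indeed increases
recall that the center $-\varepsilon$ of the circumference that represents $\mathcal{C}_1$ is to the left of
the point $-1$. Hence, $\operatorname{Re} f_{III}(\tau)$ is strictly increasing.

To show that $\operatorname{Re} f_{II}(\tau)$ is strictly increasing too it is sufficient to verify that

$$|z - \lambda| = \left|\frac{c_1(1-c_1)}{\theta c_2}\,\frac{\tau(\tau+1)}{\tau+\varepsilon} - \lambda\right|$$

is strictly increasing for any $\lambda$ from the support of $F_{\mathbf{c}}$. Since $|\tau + \varepsilon|$ remains constant,
it is sufficient to show that

$$\gamma(\tau, x) = |\tau(\tau+1) - x(\tau+\varepsilon)|^2$$

increases as $\tau$ moves away from $\tau_0$ along $\mathcal{C}_1$ for any $x = \lambda\theta c_2/[c_1(1 - c_1)]$.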

Parameterize r G Ci as —e -|- pe'°‘,a G [ 0 , 7 r/ 2 ]. Then elementary calculations 
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yield 


7 (r, x) = + {2e — 1 + — 2p^ (2e — 1 + x) cos a 

+ {e — 1)^ + 2 (p^ cos 2a — (2^ — 1 + x) p cos a) e{e — 1) 


so that 

^7 ^ 2p | — (2£ — 1 + x) Tp^ + £ (e — 1)1 + 4p£ (e — 1) cos a) . (47) 

d cos a 

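We would like to prove that the derivative $\mathrm{d}\gamma(\tau,x)/\mathrm{d}\cos\alpha$ is negative. As is seen
from (47), the derivative is decreasing in $x$ and increasing in $\cos\alpha$. Since $x>0$ and
$\cos\alpha\le 1$, it is sufficient to show that $\mathrm{d}\gamma(\tau,0)/\mathrm{d}\cos\alpha$ is negative for $\cos\alpha=1$.
We have

\[
\begin{aligned}
\left.\frac{\mathrm{d}\gamma(\tau,0)}{\mathrm{d}\cos\alpha}\right|_{\cos\alpha=1} = -2\rho(2\ell-1)\Biggl[&\left(\rho-\frac{2\ell(\ell-1)}{2\ell-1}\right)^2 \\
&+\ell(\ell-1)-\left(\frac{2\ell(\ell-1)}{2\ell-1}\right)^2\Biggr].
\end{aligned}
\]

This is negative because the expression in the second line of the above display is
positive. Indeed,

\[
\ell(\ell-1)(2\ell-1)^2-4\ell^2(\ell-1)^2 = \ell(\ell-1) > 0.
\]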

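To summarize, both $\operatorname{Re}f_{II}(\tau)$ and $\operatorname{Re}f_{III}(\tau)$ are strictly increasing as $\tau$ moves
away from $\tau_0$ along $\mathcal{C}_1$. Hence, the image of $\mathcal{C}_1$, $\mathcal{K}_1$, is a contour of steep descent of
$-\operatorname{Re}f(z)$ in the $z$-plane.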
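
The monotonicity argument for $\gamma(\tau,x)$ can be checked numerically. The sketch below is only an illustration, not part of the proof: it assumes the parameterization $\tau=-\ell+\rho e^{i\alpha}$ with $\ell>1$ and $\rho>0$, compares the expanded form of $\gamma$ with its definition, compares the right-hand side of (47) with a finite-difference derivative, and confirms the negative sign on a small grid. The function names and the grid values are hypothetical choices made for this example.

```python
import cmath
import math


def gamma_direct(ell, rho, alpha, x):
    """gamma(tau, x) = |tau (tau + 1) - x (tau + ell)|^2 with tau = -ell + rho e^{i alpha}."""
    tau = -ell + rho * cmath.exp(1j * alpha)
    return abs(tau * (tau + 1) - x * (tau + ell)) ** 2


def gamma_expanded(ell, rho, alpha, x):
    """Expanded form of gamma obtained by elementary calculations."""
    a = 2 * ell - 1 + x
    b = ell * (ell - 1)
    return (rho ** 4 + a ** 2 * rho ** 2 - 2 * rho ** 3 * a * math.cos(alpha)
            + b ** 2
            + 2 * (rho ** 2 * math.cos(2 * alpha) - a * rho * math.cos(alpha)) * b)


def dgamma_dcos(ell, rho, alpha, x):
    """Right-hand side of (47): derivative of gamma with respect to cos(alpha)."""
    a = 2 * ell - 1 + x
    b = ell * (ell - 1)
    return 2 * rho * (-a * (rho ** 2 + b) + 4 * rho * b * math.cos(alpha))


def dgamma_dcos_numeric(ell, rho, u, x, h=1e-6):
    """Central finite difference in u = cos(alpha), valid for u strictly inside (0, 1)."""
    up = gamma_direct(ell, rho, math.acos(u + h), x)
    down = gamma_direct(ell, rho, math.acos(u - h), x)
    return (up - down) / (2 * h)


def max_derivative_on_grid():
    """Largest value of (47) over an illustrative grid; it should be negative."""
    worst = -math.inf
    for ell in (1.1, 2.0, 5.0):
        for rho in (0.05, 0.2, 1.0, 3.0, 20.0):
            for x in (0.0, 0.3, 4.0):
                for k in range(21):
                    alpha = (math.pi / 2) * k / 20
                    worst = max(worst, dgamma_dcos(ell, rho, alpha, x))
    return worst
```

Since the derivative with respect to $\cos\alpha$ is negative and $\cos\alpha$ decreases as $\alpha$ grows from $0$ to $\pi/2$, $\gamma$ increases along the arc, which is the statement used in the text.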

8.4 Values of $f(z_0)$ and $\mathrm{d}^2f(z_0)/\mathrm{d}z^2$

Let us first show that $f(z_0)=0$. Recall that $f(z)=f_I+f_{II}(z)+f_{III}(z)$. The
value of $f_I$ is given in Table 2. The value of $f_{III}(z_0)$ is straightforward to compute
using the definitions of $f_{III}$ and $z_0$. The integral

\[
2f_{II}(z_0) = \int_{b_-}^{b_+}\ln(z_0-\lambda)\,\mathrm{d}F_{\mathbf{c}}(\lambda)
\]

takes on three different values: one for SMD, another for PCA and REG$_0$, and the
third one for SigD, REG, and CCA.

Lemma 13 For SigD, REG, and CCA, for any $\theta\in(0,\bar\theta)$ and for sufficiently
large $n,p$, we have

\[
2f_{II}(z_0) = 2\ln c_1-\ln\theta-\frac{1-c_1}{c_1}\ln(1+\theta)-\frac{c_1+c_2}{c_1c_2}\ln(c_1+c_2)+\frac{r^2}{c_1c_2}\ln\left[c_1l(\theta)\right]. \tag{48}
\]


Proof: We follow the usual strategy of reduction to a contour integral. First
make the change of variables $\lambda=\alpha-\beta\cos\varphi$. In order to arrange that $\lambda=b_-$ and
$b_+$ at $\varphi=0$ and $\pi$ respectively, we set

\[
\alpha = \frac{b_++b_-}{2} = \frac{c_1(r^2+c_1^2)}{(c_1+c_2)^2},\qquad
\beta = \frac{b_+-b_-}{2} = \frac{2rc_1^2}{(c_1+c_2)^2}. \tag{49}
\]

We obtain

\[
2f_{II}(z_0) = \frac{c_1+c_2}{4\pi c_1}\int_0^{2\pi}\frac{\beta^2\sin^2\varphi\,\ln(z_0-\alpha+\beta\cos\varphi)}{(\alpha-\beta\cos\varphi)(c_1-c_2\alpha+c_2\beta\cos\varphi)}\,\mathrm{d}\varphi
\]

after extending the integral from $[0,\pi]$ to $[0,2\pi]$ using the symmetry of the integrand
about $\varphi=\pi$. Now introduce $z=e^{i\varphi}$. Since $\cos\varphi=(z+z^{-1})/2$, we have
from (49) the factorizations

\[
\begin{aligned}
\alpha-\beta\cos\varphi &= \frac{c_1}{(c_1+c_2)^2}\,(r-c_1z)(r-c_1z^{-1}),\\
c_1-c_2\alpha+c_2\beta\cos\varphi &= \frac{c_1^2}{(c_1+c_2)^2}\,(r+c_2z)(r+c_2z^{-1}),\\
z_0-\alpha+\beta\cos\varphi &= q(z)\,q(z^{-1})\quad\text{with}\\
q(z) &= \frac{c_1}{c_1+c_2}\left(\sqrt{c_1l(\theta)/\theta}+rz\sqrt{\theta/[c_1l(\theta)]}\right).
\end{aligned}
\]

Our integral becomes

\[
2f_{II}(z_0) = -\frac{(c_1+c_2)r^2}{4\pi i}\oint\frac{(z-z^{-1})^2\ln\left(q(z)q(z^{-1})\right)}{(r-c_1z)(r-c_1z^{-1})(r+c_2z)(r+c_2z^{-1})}\,\frac{\mathrm{d}z}{z}.
\]

The integral has form $I=\oint\ln\left(q(z)q(z^{-1})\right)H(z)z^{-1}\mathrm{d}z$ with $H(z)=H(z^{-1})$.
Hence, expanding the logarithm yields two identical terms, so that

\[
2f_{II}(z_0) = -\frac{c_1+c_2}{2\pi i}\oint\frac{(z^2-1)^2\ln q(z)}{(r-c_1z)(z-c_1/r)(r+c_2z)(z+c_2/r)}\,\frac{\mathrm{d}z}{z}.
\]

For $\theta\in(0,\bar\theta)$ and sufficiently large $n,p$, we have $\theta\in(0,\bar\theta_p)$ with $\bar\theta_p=(c_2+r)/(1-c_2)$.
On the other hand, for $\theta\in(0,\bar\theta_p)$, the function $\ln q(z)$ is analytic inside the circle
$|z|=1$, and so the whole integrand is analytic inside the circle except for simple
poles at $z=0$, $c_1/r$ and $-c_2/r$. The residues at these poles, with the factor
$-(c_1+c_2)$ included, are respectively

\[
\frac{c_1+c_2}{c_1c_2}\ln\frac{c_1\sqrt{c_1l/\theta}}{c_1+c_2},\qquad
-\frac{1-c_1}{c_1}\ln\frac{c_1(1+\theta)}{\sqrt{\theta c_1l}},\qquad\text{and}\qquad
-\frac{1-c_2}{c_2}\ln\frac{c_1}{\sqrt{\theta c_1l}},
\]

and their sum, after collecting terms, yields formula (48). □
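
As a sanity check on formula (48), one can compare it with a direct numerical evaluation of $\int\ln(z_0-\lambda)\,\mathrm{d}F_{\mathbf{c}}(\lambda)$ in the angular variable $\varphi$ used in the proof. The sketch below is illustrative only. It assumes $r^2=c_1+c_2-c_1c_2$, $l(\theta)=1+(1+\theta)c_2/c_1$ and $z_0=(1+\theta)(\theta+c_1)/(\theta l(\theta))$, which are taken from earlier parts of the paper and are not restated in this section, and it assumes $c_1,c_2<1$ so that the limiting distribution has no point mass. The function names are hypothetical.

```python
import math


def _angular_average(c1, c2, func, n=4000):
    """Integrate func(lambda) against the limiting density via lambda = alpha - beta cos(phi).

    Uses the trapezoid rule on [0, 2 pi), which is very accurate for smooth periodic integrands.
    """
    r = math.sqrt(c1 + c2 - c1 * c2)
    alpha = c1 * (r * r + c1 * c1) / (c1 + c2) ** 2
    beta = 2 * r * c1 * c1 / (c1 + c2) ** 2
    total = 0.0
    for k in range(n):
        phi = 2 * math.pi * k / n
        lam = alpha - beta * math.cos(phi)
        weight = (beta * math.sin(phi)) ** 2 / (lam * (c1 - c2 * lam))
        total += weight * func(lam)
    return (c1 + c2) / (4 * math.pi * c1) * total * (2 * math.pi / n)


def total_mass(c1, c2):
    """Total mass of the limiting distribution; should equal one when c1, c2 < 1."""
    return _angular_average(c1, c2, lambda lam: 1.0)


def two_f2_numeric(c1, c2, theta):
    """Numerical value of 2 f_II(z_0) = integral of ln(z_0 - lambda) dF_c(lambda)."""
    # Assumed definition: z_0 = (1 + theta)(theta + c1) / (theta * l(theta)).
    theta_l = theta * (1 + (1 + theta) * c2 / c1)
    z0 = (1 + theta) * (theta + c1) / theta_l
    return _angular_average(c1, c2, lambda lam: math.log(z0 - lam))


def two_f2_closed_form(c1, c2, theta):
    """Right-hand side of (48)."""
    r2 = c1 + c2 - c1 * c2
    c1_l = c1 + c2 * (1 + theta)  # c1 * l(theta) under the assumed l(theta)
    return (2 * math.log(c1) - math.log(theta)
            - (1 - c1) / c1 * math.log(1 + theta)
            - (c1 + c2) / (c1 * c2) * math.log(c1 + c2)
            + r2 / (c1 * c2) * math.log(c1_l))
```

The comparison is meaningful only for sub-critical spikes, that is for $\theta<(c_2+r)/(1-c_2)$, where $\ln q(z)$ is analytic inside the unit circle.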

Corollary 14 For PCA and REG$_0$, for any $\theta\in(0,\bar\theta)$ and for sufficiently large
$n,p$, we have

\[
2f_{II}(z_0) = \ln c_1-\ln\theta-\frac{1-c_1}{c_1}\ln(1+\theta)+\theta/c_1. \tag{50}
\]

Proof: The corollary is obtained from Lemma 13 by taking the limit as $c_2\to 0$. □


Corollary 15 For SMD, for any $\theta\in(0,\bar\theta)$ and for sufficiently large $n,p$, we have

\[
2f_{II}(z_0) = -\ln\theta+\theta^2/2. \tag{51}
\]


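Proof: We remarked earlier that SMD is a limit of PCA and REG$_0$ as $c_1\to 0$
after the transformations $\theta\mapsto\sqrt{c_1}\theta$ and $z\mapsto\sqrt{c_1}z+1$. In particular,

\[
z_0^{(SMD)} = \lim_{c_1\to 0,\ \theta\mapsto\sqrt{c_1}\theta}\left(z_0^{(PCA)}-1\right)/\sqrt{c_1}
\quad\text{and}\quad
F^{SC}(\lambda) = \lim_{c_1\to 0}F^{MP}\left(\sqrt{c_1}\lambda+1\right).
\]

These equations imply that

\[
2f_{II}^{(SMD)}\left(z_0^{(SMD)}\right) = \lim_{c_1\to 0,\ \theta\mapsto\sqrt{c_1}\theta}\left[2f_{II}^{(PCA)}\left(z_0^{(PCA)}\right)-\ln\sqrt{c_1}\right].
\]

Using this relationship together with Corollary 14 yields $2f_{II}(z_0)=-\ln\theta+\theta^2/2$
for SMD. □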
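
The limiting relationship between the PCA and SMD cases can be illustrated numerically. The sketch below is an informal check, not part of the proof: it evaluates (50) at the rescaled spike $\sqrt{c_1}\theta$ for a small $c_1$, subtracts $\ln\sqrt{c_1}$, and compares the result with (51). It also checks that $(z_0^{(PCA)}-1)/\sqrt{c_1}$ approaches $\theta+1/\theta$, assuming $z_0^{(PCA)}=(1+\theta)(\theta+c_1)/\theta$ as defined earlier in the paper. The function names are hypothetical.

```python
import math


def two_f2_pca(theta, c1):
    """Right-hand side of (50), the PCA and REG_0 value of 2 f_II(z_0)."""
    return (math.log(c1) - math.log(theta)
            - (1 - c1) / c1 * math.log1p(theta) + theta / c1)


def two_f2_smd(theta):
    """Right-hand side of (51), the SMD value of 2 f_II(z_0)."""
    return -math.log(theta) + theta ** 2 / 2


def smd_limit_from_pca(theta, c1):
    """PCA value at the rescaled spike sqrt(c1) * theta, recentred by ln sqrt(c1)."""
    return two_f2_pca(math.sqrt(c1) * theta, c1) - 0.5 * math.log(c1)


def z0_smd_from_pca(theta, c1):
    """(z_0^{PCA} - 1) / sqrt(c1) at the rescaled spike (assumed PCA critical point)."""
    t = math.sqrt(c1) * theta
    z0_pca = (1 + t) * (t + c1) / t
    return (z0_pca - 1) / math.sqrt(c1)
```

For $c_1=10^{-8}$ the two sides agree up to an error of order $\sqrt{c_1}$, consistent with the limit stated in the proof.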
Combining equations (48), (50), and (51) with the explicit expressions for $f_I$
and $f_{III}(z_0)$, we obtain the desired result: $f(z_0)=0$ for all the six cases we
consider.

To compute $\mathrm{d}^2f(z_0)/\mathrm{d}z^2$, note that $-2\,\mathrm{d}^2f_{II}(z_0)/\mathrm{d}z^2=\mathrm{d}m_{\mathbf{c}}(z_0)/\mathrm{d}z$. Therefore,
$\mathrm{d}^2f_{II}(z_0)/\mathrm{d}z^2$ can be directly evaluated using explicit expressions for the
Stieltjes transforms of the semicircle, Marchenko-Pastur and Wachter distributions.
Formula (46) gives such an explicit expression for the Stieltjes transform of the
Wachter distribution. The explicit expressions for the Stieltjes transforms of the
semicircle and Marchenko-Pastur distributions are well known. To perform the necessary
computations, we use Maple's symbolic algebra software. Further, using the definition
of $f_{III}(z)$, we directly evaluate $\mathrm{d}^2f_{III}(z_0)/\mathrm{d}z^2$. Combining the expressions
for the second derivatives of $f_{II}$ and $f_{III}$, we obtain values of the second derivative
of $f$ reported in Table 4.
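
The identity $-2\,\mathrm{d}^2f_{II}/\mathrm{d}z^2=\mathrm{d}m/\mathrm{d}z$ follows from differentiating $2f_{II}(z)=\int\ln(z-\lambda)\,\mathrm{d}F(\lambda)$ twice under the integral sign, with $m(z)=\int(\lambda-z)^{-1}\mathrm{d}F(\lambda)$. The sketch below illustrates it for the semicircle case only, using finite differences rather than the symbolic computation mentioned in the text. It relies on the standard closed forms for the log-potential and Stieltjes transform of the semicircle law on $[-2,2]$, which are assumptions brought in for this example, and on the SMD critical point $z_0=\theta+1/\theta$. The function names are hypothetical.

```python
import math


def f2_semicircle(z):
    """f_II(z) = (1/2) * integral of ln(z - lam) dF_SC(lam) for real z > 2 (standard closed form)."""
    s = math.sqrt(z * z - 4.0)
    return 0.5 * (z * z / 4.0 - z * s / 4.0 + math.log((z + s) / 2.0) - 0.5)


def m_semicircle(z):
    """Stieltjes transform m(z) = integral of dF_SC(lam) / (lam - z) for real z > 2."""
    return (-z + math.sqrt(z * z - 4.0)) / 2.0


def dm_semicircle(z):
    """Derivative of the Stieltjes transform with respect to z."""
    return (-1.0 + z / math.sqrt(z * z - 4.0)) / 2.0


def second_derivative(func, z, h=1e-4):
    """Central finite-difference approximation of the second derivative of func at z."""
    return (func(z + h) - 2.0 * func(z) + func(z - h)) / (h * h)


def check_identity(theta):
    """Return (-2 f_II''(z_0), m'(z_0)) at z_0 = theta + 1/theta for a sub-critical theta < 1."""
    z0 = theta + 1.0 / theta
    return -2.0 * second_derivative(f2_semicircle, z0), dm_semicircle(z0)
```

At $\theta=0.5$, for example, $z_0=2.5$ and both quantities are close to $1/3$; the same closed form also reproduces (51) at $z_0$.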


8.5 Proof of Theorem 8

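First, let us show that

\[
L_1(\theta;\Lambda) = \frac{g(z_0)}{\sqrt{-2\,\mathrm{d}^2f(z_0)/\mathrm{d}z^2}}+O_p\left(p^{-1}\right), \tag{52}
\]

where the $O_p$ term is uniform with respect to $\theta\in(0,\bar\theta-\varepsilon]$. Changing the variable of
integration in (43) from $z$ to $\zeta=\theta z$, we obtain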


L, (»i A) = j e-!^K>x(C)dC, (53) 

where 

m = f{C/9)^x{C)=9{CI9)l9, 

and $\mathcal{K}$ is the image of $\mathcal{K}_1\cup\bar{\mathcal{K}}_1$ under the transformation $z\mapsto\zeta$. The set of possible
values of $\theta$ is $\Omega=(0,\bar\theta-\varepsilon]$.

Using Table 4 and the definitions of $\mathcal{K}_1$, $z_0$, $f(z)$, and $g(z)$, it is straightforward
to verify that the assumptions A0-A4 of Lemma 7 hold for the integral in (53) for
all the six cases that we consider. The validity of A5 follows from Lemma 16 given
below and from the definitions of $g(z)$. Let

\[
\Delta(\zeta) = p\int\ln(\zeta/\theta-\lambda)\,\mathrm{d}\left(\hat F(\lambda)-F_{\mathbf{c}}(\lambda)\right), \tag{54}
\]

so that $\Delta(\zeta)=-2\ln g_{II}(\zeta/\theta)$.
Lemma 16 Suppose that the null hypothesis holds, that is $\theta_0=0$. Then there
exists a positive constant $C_1$ such that for a subset $\Theta$ of $\mathbb{C}$ that consists of all
points whose Euclidean distance from $\mathcal{K}$ is no larger than $C_1$, we have

\[
\sup_{\zeta\in\Theta}|\Delta(\zeta)| = O_p(1)
\]

as $n,p\to\infty$, where $O_p(1)$ is uniform with respect to $\theta\in\Omega=(0,\bar\theta-\varepsilon]$.

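Proof: Let us rewrite (54) in the following equivalent form

\[
\Delta(\zeta) = p\int\ln(1-\lambda\theta/\zeta)\,\mathrm{d}\left(\hat F(\lambda)-F_{\mathbf{c}}(\lambda)\right).
\]

The two forms agree because the term $\ln(\zeta/\theta)$ does not depend on $\lambda$ and therefore
integrates to zero against the difference of two probability distributions.
Statistic $\Delta(\zeta)$ is a special form of a linear spectral statistic

\[
\Delta(\varphi) = p\int\varphi(\lambda)\,\mathrm{d}\left(\hat F(\lambda)-F_{\mathbf{c}}(\lambda)\right)
\]

studied by Bai and Yao (2005), Bai and Silverstein (2004), and Zheng (2012) for
the cases of the semicircle, Marchenko-Pastur, and Wachter limiting distributions,
respectively. These papers note that

\[
\Delta(\varphi) = -\frac{p}{2\pi i}\oint_{\mathcal{P}}\varphi(\xi)\left(\hat m(\xi)-m_{\mathbf{c}}(\xi)\right)\mathrm{d}\xi,
\]

where

\[
\hat m(\xi) = \int\frac{\mathrm{d}\hat F(\lambda)}{\lambda-\xi}\quad\text{and}\quad m_{\mathbf{c}}(\xi) = \int\frac{\mathrm{d}F_{\mathbf{c}}(\lambda)}{\lambda-\xi}
\]

are the Stieltjes transforms of $\hat F$ and $F_{\mathbf{c}}$, and $\mathcal{P}$ is a positively oriented contour in
an open neighborhood of the supports of $\hat F$ and $F_{\mathbf{c}}$, where $\varphi(\xi)$ is analytic, that
encloses these supports. Further, the papers show that if the distance from $\mathcal{P}$ to
the supports of $\hat F$ and $F_{\mathbf{c}}$ stays away from zero with probability approaching one
as $n,p\to\infty$, then

\[
\Delta(\varphi) = -\frac{1}{2\pi i}\oint_{\mathcal{P}}\varphi(\xi)\hat M(\xi)\,\mathrm{d}\xi+O_p(1), \tag{55}
\]

where $\hat M(\xi)$ is a truncated version of $p\left[\hat m(\xi)-m_{\mathbf{c}}(\xi)\right]$ that weakly converges to
a random continuous function on $\mathcal{P}$ with Gaussian finite dimensional distributions.
Furthermore, $O_p(1)$ in (55) is uniform in $\varphi$ that are analytic in the open neighborhood
of the supports of $\hat F$ and $F_{\mathbf{c}}$ and such that $\sup_{\xi\in\mathcal{P}}|\varphi(\xi)|<K$ for some
constant $K$. Therefore, for any $\delta>0$, there exists $B>0$ such that

\[
\Pr\left(|\Delta(\varphi)|\le B\sup_{\xi\in\mathcal{P}}|\varphi(\xi)|\right)\ge 1-\delta \tag{56}
\]

for all $n$ and $p$. Moreover, the constant $B$ does not depend on $\varphi$. Now, consider a
family of functions

\[
\left\{\varphi_{\zeta,\theta}(\lambda)=\ln(1-\lambda\theta/\zeta):\ \zeta\in\Theta\ \text{and}\ \theta\in\Omega\right\}.
\]

By the definitions of $\Theta$ and $\Omega$, there exists an open neighborhood $\mathcal{N}$ of the supports
of $\hat F$ and $F_{\mathbf{c}}$ and a constant $B_1$ such that, with probability arbitrarily close to one,
for sufficiently large $n$ and $p$, $\varphi_{\zeta,\theta}(\lambda)$ are analytic in $\mathcal{N}$ for all $\zeta\in\Theta$ and $\theta\in\Omega$
and

\[
\sup_{\theta\in\Omega}\sup_{\zeta\in\Theta}\sup_{\lambda\in\mathcal{N}}|\varphi_{\zeta,\theta}(\lambda)|\le B_1.
\]

Since $\Delta(\varphi_{\zeta,\theta})=\Delta(\zeta)$, we obtain from (56) that for any $\delta>0$, there exists $B_2>0$
such that for sufficiently large $n$ and $p$,

\[
\Pr\left(\sup_{\theta\in\Omega}\sup_{\zeta\in\Theta}|\Delta(\zeta)|\le B_2\right)\ge 1-\delta.
\]

In other words, $\sup_{\zeta\in\Theta}|\Delta(\zeta)|=O_p(1)$ uniformly over $\theta\in\Omega$. □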
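
The contour-integral representation of a linear spectral statistic used in the proof can be illustrated on a small deterministic example. The sketch below is not taken from the cited papers: it takes a hypothetical list of ten "eigenvalues", the test function $\varphi(\lambda)=\ln(1-\lambda\theta/\zeta)$ with illustrative values of $\theta$ and $\zeta$, and checks that $\int\varphi\,\mathrm{d}\hat F=-\frac{1}{2\pi i}\oint\varphi(\xi)\hat m(\xi)\,\mathrm{d}\xi$ on a circle that encloses the eigenvalues but not the branch point $\zeta/\theta$ of $\varphi$. The contour integral is approximated by the trapezoid rule.

```python
import cmath
import math


def stieltjes(eigs, xi):
    """Empirical Stieltjes transform m(xi) = average of 1 / (lam - xi)."""
    return sum(1.0 / (lam - xi) for lam in eigs) / len(eigs)


def lss_direct(eigs, phi):
    """Integral of phi against the empirical distribution of eigs."""
    return sum(phi(lam) for lam in eigs) / len(eigs)


def lss_contour(eigs, phi, center, radius, n=2000):
    """-(1 / (2 pi i)) * contour integral of phi(xi) m(xi) over a positively oriented circle."""
    total = 0j
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)
        xi = center + radius * w
        dxi = 1j * radius * w * (2 * math.pi / n)
        total += phi(xi) * stieltjes(eigs, xi) * dxi
    return -total / (2j * math.pi)


def example():
    """Return (direct, contour) values for an illustrative configuration."""
    eigs = [0.1 + 1.4 * k / 9 for k in range(10)]  # hypothetical spectrum in [0.1, 1.5]
    theta, zeta = 0.5, 2.0 + 1.0j                   # illustrative values
    phi = lambda x: cmath.log(1 - x * theta / zeta)
    return lss_direct(eigs, phi), lss_contour(eigs, phi, center=0.8, radius=1.0)
```

The sign in front of the contour integral comes from the convention $\hat m(\xi)=\int(\lambda-\xi)^{-1}\mathrm{d}\hat F(\lambda)$, whose residue at $\xi=\lambda$ is $-1$.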

Applying Lemma 7 to the integral in (53) and using the fact that $f(z_0)=0$,
we obtain (52). It remains to show that $L_2(\theta;\Lambda)$ is asymptotically dominated by
$L_1(\theta;\Lambda)$, where

\[
L_2(\theta;\Lambda) = L(\theta;\Lambda)-L_1(\theta;\Lambda).
\]

For SMD, PCA, and SigD we have


L 2 {9;A)\ = 


< 

< 




^-pUi+JiiiF. 


27ri 


'K2UK2 


'’^9i9iii{z)YI 


i=i 




1/2 


dz 


Explicitly evaluating the latter integral and using the exact form of $g_I$, available
from Table 3, we obtain

\U (^; A)| < (2zo)"'’^'e-PA»(-o) (i + ^(l)), 

,/wp 

where o(l) does not depend on 6, C = 1 for SMD and PCA, and C = y'ci + C 2 /r 
for SigD. Therefore, 

2C 

1^2 (6'; A)| < exp {-p {\n{2zQ)/2 - fu (^^o))} (1 + o(l)) 

= {-f/'" (^) ' 

where we used the fact that $f(z_0)=0$. But $\ln\left(2z_0/(z_0-\lambda)\right)$ is positive and
bounded away from zero uniformly over $\theta\in(0,\bar\theta-\varepsilon]$ with probability arbitrarily
close to one, for sufficiently large $n,p$. Hence, there exists a positive constant $K$
such that

2C 

|i2(9;A)|<—e-'''^ (1 + 0(1)) 

with probability arbitrarily close to one for sufficiently large $n,p$. Combining this
inequality with (52), we establish Theorem 8 for SMD, PCA, and SigD.

For REG$_0$, we shall need the following lemma.

Lemma 17 For sufficiently large n and p, we have 

|oEi(&-s;d'ii^)| < 4v^|exp{-m<y9o(fo)}| (57) 

for any $z$ and any $\theta>0$.

Proof: We use the identity (see formula 9.6.3 in Abramowitz and Stegun
(1964))

\[
I_m(\zeta) = e^{-m\pi i/2}J_m(i\zeta)\quad\text{for } -\pi<\arg\zeta\le\pi/2,
\]

where $J_m(\cdot)$ is the Bessel function. The identity and (22) imply that

oFi {h - s] Tii^) = P (m + 1) ( 1270 ^^^^ . (58) 
On the other hand, for any $\zeta$ and any positive $\kappa$,

\[
|J_\kappa(\kappa\zeta)| \le \left|1+\frac{\sin\kappa\pi}{\kappa\pi}\right|\,\left|\frac{\zeta\exp\left\{\sqrt{1-\zeta^2}\right\}}{1+\sqrt{1-\zeta^2}}\right|^{\kappa} \tag{59}
\]


(see Watson (1944), p. 270). The latter inequality, equation (58), and the Stirling
formula for $\Gamma(m+1)$ imply that (57) holds for sufficiently large $m$, for any $z$ and
$\theta>0$. The constant 4 on the right hand side of (57) is not the smallest possible
one, but it is sufficient for our purposes. □
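
The identity between the modified Bessel function $I_m$ and the Bessel function $J_m$ quoted from Abramowitz and Stegun can be illustrated with the ascending power series of both functions. The sketch below is only a numerical illustration under stated assumptions: it uses the standard series $J_\nu(w)=\sum_k(-1)^k(w/2)^{2k+\nu}/(k!\,\Gamma(k+\nu+1))$ and its analogue without alternating signs for $I_\nu$, with principal-branch complex powers, and a non-integer order so that the restriction on $\arg\zeta$ matters. The function names, the truncation at 60 terms and the test points are choices made for this example.

```python
import cmath
import math


def _series(nu, w, sign, terms=60):
    """Ascending series sum_k sign^k (w/2)^(2k+nu) / (k! Gamma(k+nu+1)), principal branch."""
    total = 0j
    for k in range(terms):
        coeff = math.factorial(k) * math.gamma(k + nu + 1)
        total += sign ** k * (w / 2) ** (2 * k + nu) / coeff
    return total


def bessel_j(nu, w):
    """Bessel function of the first kind J_nu(w) for complex w of moderate size."""
    return _series(nu, complex(w), -1)


def bessel_i(nu, w):
    """Modified Bessel function I_nu(w) for complex w of moderate size."""
    return _series(nu, complex(w), 1)


def rotated_j(nu, zeta):
    """Right-hand side of the identity: exp(-nu pi i / 2) * J_nu(i zeta)."""
    return cmath.exp(-1j * nu * math.pi / 2) * bessel_j(nu, 1j * zeta)
```

For $\nu=2.5$ and $\zeta=0.7-0.4i$, whose argument lies in $(-\pi,\pi/2]$, the two sides agree to rounding error. For $\zeta=-0.7+0.4i$, whose argument exceeds $\pi/2$, they differ, which shows why the identity is stated with a restriction on $\arg\zeta$.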

Using inequality (57), we obtain for REG$_0$



p 

exp {-rrnpo{to)} JJ (^ - dz 

i=i 


(60) 


It is straightforward to verify that $\operatorname{Re}\varphi_0(t_0)$ is strictly increasing as $z$ is moving
along $\mathcal{K}_2$ towards $-\infty$. Therefore, for any $z\in\mathcal{K}_2$,


Re(po {to{z)) > Re(po{to{z)), 


where z = zi+i {zo — zi) is the point of IC 2 where IC 2 meets 1C\. The latter inequality 
together with (|60|l yields 


IL 2 (»i A)| < \mi (?)l / il 




A, 


A, 


1/2 


|d^| 


Since, for some constant ri. Re / (z) > f (zq) + ti = ti and since, by Lemma fT 6 l 
dgii (z) = Op (1) uniformly over 9 G (O, (9 — e] , we obtain 


r ^ 

\L2 (0; A)| < / TT 


2 ; — A,- 


2 ; — A,' 


1/2 


|dz|Op(l). 


(61) 


Note that for any z G /C 2 and any j = \{z — Xj) / {z — \j)\ < 1 and 

I 2 ; — AjI > \z\ . Further, since Zq < \z\ and with probability arbitrary close to 
one, for sufficiently large n and p, Ai < zq, we have \z — A/ < Iz — zqI <2 \z\. 
Thus, for $p>4$, we have



Combining this with flhTll and noting that gi\z\ = O (1) uniformly over 9 G 
(O, 0 — e] , we obtain 

1^2 (6*; A) I < (1), (62) 

where $O_p(1)$ is uniform with respect to $\theta\in(0,\bar\theta-\varepsilon]$. Theorem 8 for REG$_0$ follows
from the latter inequality and (52).

For REG and CCA, the theorem follows from (52) and inequalities

\[
|L_2(\theta;\Lambda)| \le p\,e^{-p\tau_2}O_p(1), \tag{63}
\]

where $\tau_2$ is a positive constant. We obtain (63) by combining the method used
to derive (62) with upper bounds on ${}_1F_1$ and ${}_2F_1$, which we establish using the
integral representations (25). We omit details to save space. □


References 

[1] Abramowitz, M., and Stegun, I. A. (eds.) (1964) Handbook of Mathematical 
Functions with Formulas, Graphs, and Mathematical Tables, National Bureau 
of Standards. 

[2] Bai, Z.D. (1999) “Methodologies in Spectral Analysis of Large Dimensional 
Random Matrices, a Review,” Statistica Sinica 9, 611-677. 

[3] Bai, Z.D. and J.W. Silverstein (2004) “CLT for Linear Spectral Statistics of
Large-Dimensional Sample Covariance Matrices”, Annals of Probability 32,
553-605.

[4] Bai, Z.D., and J.W. Silverstein (2006) Spectral Analysis of Large Dimensional 
Random Matrices, Science Press, Beijing. 

[5] Baik, J. and J.W. Silverstein. (2006) “Eigenvalues of large sample covariance 
matrices of spiked population models”. Journal of Multivariate Analysis 97, 
1382-1408. 







[6] Bai, Z.D., and Yao, J. (2005) “On the Convergence of the Spectral Empirical 
Process of Wigner matrices”, Bernoulli 11(6), 1059-1092. 

[7] Bao, Z., J. Hu, G. Pan, and W. Zhou (2014) “Canonical correlation coefficients
of high-dimensional normal vectors: finite rank case,” arXiv 1407.7194

[8] Dharmawansa, P. and I. M. Johnstone (2014) “Joint density of eigenvalues in 
spiked multivariate models,” Stat 3, no. 1, 240-249. 

[9] Dharmawansa, P., I.M. Johnstone, and A. Onatski (2014a) “Local Asymptotic
Normality of the spectrum of high-dimensional spiked F-ratios,” arXiv
1411.3875

[10] Fyodorov, Y. V., B.A. Khoruzhenko, and N.J. Simm (2013) “Fractional Brownian
motion with Hurst index H=0 and the Gaussian Unitary Ensemble,”
arXiv 1312.0212v1.

[11] James, A.T. (1955) “The Non-central Wishart Distribution,” Proceedings of 
the Royal Society of London. Series A, Mathematical and Physical Sciences 
229, pp. 364-366. 

[12] James, A. T. (1964) “Distributions of matrix variates and latent roots derived 
from normal samples,” Annals of Mathematical Statistics 35, 475-501. 

[13] Luke, Y.L. (1969) The Special Functions and Their Approximations, Volume 
1. Academic Press. 

[14] Maïda, M. (2007) “Large deviations for the largest eigenvalue of rank one
deformations of Gaussian ensembles,” Electronic Journal of Probability 12,
1131-1150.

[15] Mehta, M.L. (2004) Random Matrices. Third edition. Elsevier Academic
Press.

[16] Montanari, A., Reichman, D., and Zeitouni, O. (2014) “On the limitation
of spectral methods: From the Gaussian hidden clique problem to rank one
perturbations of Gaussian tensors,” arXiv: 1411.6149v1.

[17] Muirhead, R.J. (1978) “Latent roots and matrix variates: a review of some 
asymptotic results,” Annals of Statistics 6, 5-33. 





[18] Muirhead, R. J. (1982) Aspects of Multivariate Statistical Theory. John Wiley
& Sons, Hoboken, New Jersey.

[19] Nadakuditi, R.R. and J. W. Silverstein (2010) “Fundamental Limit of Sample
Generalized Eigenvalue Based Detection of Signals in Noise Using Relatively
Few Signal-Bearing and Noise-Only Samples,” IEEE Journal of Selected Topics
in Signal Processing 4 (3), 468-480.

[20] Olver, F.W.J. (1997) Asymptotics and Special Functions, A K Peters, Natick, 
Massachusetts. 

[21] Olver, F.W.J., D. W. Lozier, R.F.Boisvert, and C.W. Clark (eds.) (2010) 
NIST Handbook of Mathematical Functions, Cambridge University Press, 
Cambridge. 

[22] Onatski, A. (2007) “Asymptotics of the principal components estimator of
large factor models with weak factors and i.i.d. Gaussian noise,” manuscript.
University of Cambridge. Available at
http://www.econ.cam.ac.uk/people/faculty/ao319/pubs/inference45a.pdf.

[23] Onatski, A., Moreira, M.J., and Hallin, M. (2013) “Asymptotic power of 
sphericity tests for high-dimensional data”. Annals of Statistics 41, 1204-1231. 

[24] Paris, R.B. (2013a) “Asymptotics of the Gauss Hypergeometric Function with 
Large Parameters, I,” Journal of Classical Analysis 2, 183-203. 

[25] Paris, R.B. (2013b) “Asymptotics of the Gauss Hypergeometric Function with
Large Parameters, II,” Journal of Classical Analysis 3, 1-15.

[26] Passemier, D., McKay, M.R., and Y. Chen (2014) “Asymptotic Linear
Spectral Statistics for Spiked Hermitian Random Matrix Models”,
arXiv: 1402.6419v1.

[27] Silverstein, J.W. (1985) “The Limiting Eigenvalue Distribution of a Multivariate
F Matrix,” SIAM Journal of Mathematical Analysis 16 (3), 641-646.

[28] van der Vaart, A.W. (1998) Asymptotic Statistics, Cambridge University 
Press. 

[29] Wachter, K. (1980) “The limiting empirical measure of multiple discriminant 
ratios,” The Annals of Statistics 8, 937-957. 





[30] Watson, G.N. (1944) A Treatise on the Theory of Bessel Functions, Cambridge
University Press.

[31] Yin, Y.Q, Bai, Z.D., and Krishnaiah, P.R. (1983) “Limiting Behavior of the 
Eigenvalues of a Multivariate F Matrix,” Journal of Multivariate Analysis 13 
(4), 508-516. 

[32] Zheng, S. (2012) “Central limit theorems for linear spectral statistics of large
dimensional F-matrices”, Annales de l'Institut Henri Poincaré - Probabilités
et Statistiques 48, 444-476.





